***** infoCopter.com/perl *****
Character Encoding and Safe Queries
Perl offers built-in functions for these kinds of needs.
[ Binary-Arithmetics ]
Consider the following code:
print "hex FF = ", hex('FF'), "\n",
"unpack(B8,'A') = ", unpack('B8', 'A'), "\n",
"unpack(H8,'HALLO') = ", unpack('H8', 'HALLO'), "\n",
"unpack(C, 'A') = ", unpack('C', 'A'), "\n";
Output would be:
hex FF = 255
unpack(B8,'A') = 01000001
unpack(H8,'HALLO') = 48414c4c
unpack(C, 'A') = 65
Most wanted conversions
Decimal to Hex
255 -> FF
my $decimal = 3456;
my $hex = &dec2hex($decimal);
print "decimal $decimal = '$hex'\n";   # decimal 3456 = 'd80'
# %x prints lowercase digits; use %X for uppercase (255 -> FF).
sub dec2hex($) { return sprintf("%x", $_[0]) }
Char to Hex
ABC -> 41 42 43
my $hex = &iso2hex('ABC');
print "iso 'ABC' = '$hex'\n";
sub iso2hex($) {
my $hex = '';
for (my $i = 0; $i < length($_[0]); $i++) {
my $ordno = ord substr($_[0], $i, 1);
$hex .= sprintf("%lx ", $ordno);
}
$hex =~ s/ $//;
$hex;
}
# Same as iso2hex, but pads each byte to two hex digits (0a, not a).
sub iso2hex_new($) {
my $hex = '';
for (my $i = 0; $i < length($_[0]); $i++) {
my $ordno = ord substr($_[0], $i, 1);
$hex .= sprintf("%02x ", $ordno);
}
$hex =~ s/ $//;
$hex;
}
See also: iso2hex
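The character loop above can also be written with unpack. This is a minimal sketch, assuming the input is a byte string with one byte per character (ISO-8859-1 or plain ASCII). The name str2hex is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: hex dump of a byte string, two digits per byte.
sub str2hex {
    my ($bytes) = @_;
    # '(H2)*' repeats the H2 template, so unpack returns one
    # two-digit hex string per input byte.
    return join ' ', unpack('(H2)*', $bytes);
}

print "iso 'ABC' = '", str2hex('ABC'), "'\n";   # iso 'ABC' = '41 42 43'
```

Because H2 always yields two digits, no separate zero padding is needed.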
Hex to ISO
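41 42 43 -> "ABC"
sub hex2iso ($) {
my $iso = '';
(my $hex = $_[0]) =~ tr/ //d;
# Step through the hex string two digits (one byte) at a time.
for (my $i = 0; $i < length($hex); $i += 2) {
# H2 packs exactly one byte; H8 would zero-pad each result to four bytes.
$iso .= pack('H2', substr($hex, $i, 2));
}
$iso;
}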
sub hex2iso_new ($) {
my $iso = '';
(my $hex = $_[0]) =~ tr/ //d;
for (my $i = 0; $i < length($hex); $i += 2) {
my $char = pack('H8', substr($hex, $i, 2));
$iso .= substr($char, 0, 1);
}
$iso;
}
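The conversion above can probably be reduced to a single pack call. This is a sketch, assuming the input holds only hex digits plus separators such as spaces. The name hex2bytes is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a hex dump such as "41 42 43" into bytes.
sub hex2bytes {
    my ($hex) = @_;
    $hex =~ tr/0-9A-Fa-f//cd;    # delete everything that is not a hex digit
    return pack('H*', $hex);     # every two hex digits become one byte
}

print hex2bytes('41 42 43'), "\n";   # ABC
```

The H* template consumes the whole string, so no loop or substr is needed.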
Decimal to Binary
255 -> 11111111
my $decimal = 153;
my $binary = unpack('B8', pack('C', $decimal));   # works for 0..255 only
print "Decimal $decimal = '$binary'\n";
# <-- Decimal 153 = '10011001'
or better:
sub dec2bin($) { return sprintf("%b", $_[0]) }
or with leading zeros, padded to a multiple of 8 digits:
sub dec2bin($) {
my $bin = sprintf("%b", $_[0]);
my $padding = 0;
$padding = 8 - length($bin) % 8 if length($bin) % 8;
return substr('00000000', 0, $padding) . $bin;
}
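Two related one-liners, as a sketch: sprintf can pad to a fixed minimum width by itself, and oct() converts a binary string back to decimal when it carries a 0b prefix. The names dec2bin8 and bin2dec are made up for this example. Note that %08b pads to at least 8 digits, not to a multiple of 8 as the function above does.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for this example only.
sub dec2bin8 { return sprintf('%08b', $_[0]) }   # at least 8 binary digits
sub bin2dec  { return oct('0b' . $_[0]) }        # oct() understands a 0b prefix

print dec2bin8(153), "\n";          # 10011001
print bin2dec('10011001'), "\n";    # 153
```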
Hexadecimal to Decimal
FF -> 255
sub hex2dec($) {
eval "return sprintf(\"\%d\", 0x$_[0])";
}
or shorter, and without running the input through eval:
sub hex2dec($) { return hex $_[0] }
Decimal to Addends
my $res = &dec2addends( number => $ARGV[0] || 153 );
foreach ( @{$res->{'add_numbers'}} ) {
print "------> $_\n";
}
sub dec2addends (%) {
my %args = @_;
my %hash = (); # init
$hash{'bin'} = dec2bin($args{'number'});
my $factor = 2 ** (length($hash{'bin'}) - 1);
my @add_numbers = ();
for (my $i = 0; $i < length($hash{'bin'}); $i++) {
my $add_number = substr($hash{'bin'}, $i, 1) * $factor;
push(@add_numbers, $add_number) if $add_number;
$factor /= 2;
}
$hash{'add_numbers'} = \@add_numbers;
\%hash;
}
e.g. Decimal 1000:
bin = '1111101000'
add_numbers = 'ARRAY(0x806552c)'
------> 512
------> 256
------> 128
------> 64
------> 32
------> 8
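The same decomposition can be sketched with bit operations instead of going through the binary string. This assumes a non-negative integer argument. The name addends is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: powers of two that sum to $n, largest first.
sub addends {
    my ($n) = @_;
    my @parts;
    for (my $bit = 1; $bit <= $n; $bit <<= 1) {
        unshift @parts, $bit if $n & $bit;   # keep only the bits that are set
    }
    return @parts;
}

print join(' + ', addends(1000)), "\n";   # 512 + 256 + 128 + 64 + 32 + 8
```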
"Royce Kemp" wrote in message
news:b5923048.0201121758.6192b156@posting.google.com...
> I want to be able to take ASCII character strings like the ones shown
> below and store them into a file converted to binary...but not the
> ASCII binary representation....rather the literal binary.
>
> 80000000000000000000000000000000
> 66e94bd4ef8a2c3b884cfa59ca342b2e
>
> So, 80000000000000000000000000000000 would be stored as
>
> 1000 0000 0000 0000 0000 ....
>
> and 66e94bd4ef8a2c3b884cfa59ca342b2e would be stored as
>
> 1010 1010 1110 1001 0010 ....
>
> is there a way to use the pack/unpack functions to do this for me? if
> not, how can i control precisely what binary values are written to the
> file.
>
> thanks in advance.
> -r
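One way to approach the question above, as a sketch rather than the answer given in the thread: pack with the H* template turns each pair of hex digits into one byte, and binmode keeps the file handle from translating those bytes. The file name kemp.bin and the helper name are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: write the bytes described by hex strings to a file.
sub write_hex_as_binary {
    my ($file, @hex_strings) = @_;
    open(my $out, '>', $file) or die "Cannot open $file: $!";
    binmode($out);                          # no newline or encoding translation
    print {$out} pack('H*', $_) for @hex_strings;
    close($out) or die "Cannot close $file: $!";
}

write_hex_as_binary('kemp.bin',
    '80000000000000000000000000000000',
    '66e94bd4ef8a2c3b884cfa59ca342b2e');

my $size = -s 'kemp.bin';
print "$size bytes written\n";   # 32 bytes written
```

Each hex digit corresponds to four bits, so the leading 80 is stored as 1000 0000 and each 32-digit string occupies 16 bytes.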
URI encoding
URI::Escape - Escape and unescape unsafe characters
This module provides functions to escape and unescape URI strings as
defined by RFC 2396 (and updated by RFC 2732). URIs consist of a
restricted set of characters, denoted as "uric" in RFC 2396. The
restricted set of characters consists of digits, letters, and a few
graphic symbols chosen from those common to most of the character
encodings and input facilities available to Internet users.
use URI::Escape;
my $safe = uri_escape("10% is enough\n");
print "safe = '$safe'\n"; # -- safe = '10%25%20is%20enough%0A' (the newline is escaped too)
Inline Subroutine of uri_escape
If you cannot install the CPAN module, an inline version does the same job:
#!/usr/bin/perl -w
use strict;
print &uri_escape($ARGV[0]), "\n";
sub uri_escape {
my $text = $_[0];
return undef unless defined $text;
# Build a char to hex map
my %escapes = ();
for (0..255) {
$escapes{chr($_)} = sprintf("%%%02X", $_);
}
# Default unsafe characters. RFC 2732 ^(uric - reserved)
$text =~ s/([^A-Za-z0-9\-_.!~*'()])/$escapes{$1}/g;
$text;
}
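The reverse direction can be sketched the same way. URI::Escape also exports uri_unescape. The inline version below is an approximation under the assumption that the input contains only well-formed %XX sequences. Anything else is passed through unchanged.

```perl
use strict;
use warnings;

# Inline stand-in for decoding, named after URI::Escape's uri_unescape.
sub uri_unescape {
    my ($text) = @_;
    return undef unless defined $text;
    # Replace each %XX sequence with the byte it encodes.
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $text;
}

print uri_unescape('10%25%20is%20enough'), "\n";   # 10% is enough
```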
Transforming Character Sets
/usr/bin/iconv -f "UTF-8" -t "ISO-8859-1" temp_file
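The same conversion can be done inside Perl with the core Encode module. This is a sketch, assuming the input is a valid UTF-8 byte string. The helper name utf8_to_latin1 is made up for this example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: re-encode a UTF-8 byte string as ISO-8859-1.
sub utf8_to_latin1 {
    my ($utf8_bytes) = @_;
    my $chars = decode('UTF-8', $utf8_bytes);   # bytes -> Perl characters
    return encode('ISO-8859-1', $chars);        # characters -> Latin-1 bytes
}

my $latin1 = utf8_to_latin1("Z\xC3\xBCrich");   # UTF-8 bytes for "Zürich"
print length($latin1), "\n";                    # 6
```

The two-byte UTF-8 sequence C3 BC becomes the single Latin-1 byte FC, which is why the result is six bytes long.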
Web Encoding to ISO-8859-1
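&#228; = ä
&#246; = ö
&#252; = ü
my $string = 'Z&#252;rich f&#228;nde ich sch&#246;n';
# Replace one numeric entity per pass until none are left.
while ($string =~ /\&#\d+;/) {
my $tmp_string = $string;
$tmp_string =~ s/.*\&#(\d+);.*/$1/;        # extract an entity's number
my $char = pack('C', $tmp_string);         # number -> single byte
$string =~ s/\&#$tmp_string;/$char/g;
}
print "asc = '$string'\n";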
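The loop above can be collapsed into a single substitution with the /e modifier. This is a sketch, assuming only decimal entities matter and that codes above 255, which have no ISO-8859-1 byte, should be left as they are. The name entities_to_latin1 is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: replace decimal entities in the Latin-1 range.
sub entities_to_latin1 {
    my ($text) = @_;
    # Codes above 255 have no ISO-8859-1 byte, so they are left untouched.
    $text =~ s/&#(\d+);/$1 <= 255 ? chr($1) : "&#$1;"/eg;
    return $text;
}

my $string = entities_to_latin1('Z&#252;rich f&#228;nde ich sch&#246;n');
print length($string), "\n";   # 22
```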