The following subroutine returns a browser-safe encoded version of a given string.
E.g. from Zürich to Z&#252;rich
sub web_enc ($) {
    my $enc = '';
    for (my $i = 0; $i < length($_[0]); $i++) {
        my $ordno = ord substr($_[0], $i, 1);
        # Characters above 7-bit ASCII become numeric character references.
        $enc .= $ordno > 127 ? sprintf("&#%d;", $ordno) : substr($_[0], $i, 1);
    }
    $enc =~ s/ $//;    # strip one trailing space
    $enc;
}
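A minimal usage sketch, assuming the argument is already a decoded character string (one element per character, not raw UTF-8 bytes). The `use utf8` pragma is an assumption about how the source file is saved; the subroutine body is repeated here only so the snippet runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Same logic as the subroutine above, repeated to keep the sketch self-contained.
sub web_enc {
    my ($text) = @_;
    my $enc = '';
    for my $char (split //, $text) {
        my $ordno = ord $char;
        # Anything above 7-bit ASCII becomes a decimal character reference.
        $enc .= $ordno > 127 ? sprintf('&#%d;', $ordno) : $char;
    }
    $enc =~ s/ $//;    # strip one trailing space, as in the original
    return $enc;
}

print web_enc('Zürich'), "\n";    # prints Z&#252;rich
```

If raw UTF-8 bytes are passed instead, each byte above 127 is encoded separately, which yields two references for one character. Decoding the input first avoids that.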
Web-encoding (generic)
This code converts special characters to web-encoded characters (numeric character
references), independent of the character encoding used by the host (ISO-8859-1 or UTF-8).
#!/usr/bin/perl -w
use strict;
use Unicode::String qw(latin1 utf8);

# Read the sample file line by line; abort if it cannot be opened.
open(IN, "<umlaute.txt") or die "Cannot open umlaute.txt: $!";
while (my $in = <IN>) {
    print "--> $in";
    my $text_iso  = (utf8($in))->latin1;          # treat input as UTF-8, convert to ISO
    my $text_utf8 = (latin1($text_iso))->utf8;    # reverse check
    my $input = $text_iso;
    $input = $in if $in ne $text_utf8;            # Is ISO already!
    print '<-- ', &web_enc($input);
}
close IN;
sub web_enc ($) {
    my $enc = '';
    for (my $i = 0; $i < length($_[0]); $i++) {
        my $ordno = ord substr($_[0], $i, 1);
        $enc .= $ordno > 127 ? sprintf("&#%d;", $ordno) : substr($_[0], $i, 1);
    }
    $enc =~ s/ $//;
    $enc;
}
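Unicode::String is not part of the Perl core. As an alternative sketch, the same "ISO or UTF" guess can be made with the core Encode module. This is not the method used above: the helper name `decode_guess` is hypothetical, and the approach assumes that any input which is not valid UTF-8 is ISO-8859-1.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode raw bytes as UTF-8 when they are valid UTF-8,
# otherwise fall back to ISO-8859-1 (Latin-1).
sub decode_guess {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps $bytes intact.
    my $text = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return defined $text ? $text : decode('ISO-8859-1', $bytes);
}

# Compact version of web_enc, repeated so the sketch is self-contained.
sub web_enc {
    my ($text) = @_;
    my $enc = '';
    for my $char (split //, $text) {
        my $ordno = ord $char;
        $enc .= $ordno > 127 ? sprintf('&#%d;', $ordno) : $char;
    }
    return $enc;
}

# The same word as UTF-8 bytes and as Latin-1 bytes gives the same result.
print web_enc(decode_guess("Z\xC3\xBCrich")), "\n";    # prints Z&#252;rich
print web_enc(decode_guess("Z\xFCrich")), "\n";        # prints Z&#252;rich
```

As with the round-trip check above, this is a heuristic: Latin-1 text whose bytes happen to form valid UTF-8 is treated as UTF-8.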
See also:
Char encoding
To utf
Toiso
Url encoding (legacy)