Perl: Difference between revisions
(Functions and Modules) |
No edit summary |
||
Line 15: | Line 15: | ||
** http://perldoc.perl.org/perl.html |
** http://perldoc.perl.org/perl.html |
||
** http://perldoc.perl.org/perlintro.html |
** http://perldoc.perl.org/perlintro.html |
||
** http://faq.perl.org/ |
|||
** http://perldoc.perl.org/index-tutorials.html |
** http://perldoc.perl.org/index-tutorials.html |
||
** http://perldoc.perl.org/index-modules-A.html |
** http://perldoc.perl.org/index-modules-A.html |
||
* '''[http://faq.perl.org/ FAQ]''' |
|||
* Most relevant perldoc manpages (from the FAQ): |
|||
: The [http://faq.perl.org/ FAQ] is the primary source of answers to questions like ''How can I do...''. |
|||
* '''Manpages''' - List of highly recommended perldoc manpages (from the FAQ). |
|||
{{pl2| |
{{pl2| |
||
Basics [http://perldoc.perl.org/perldata.html perldata], [http://perldoc.perl.org/perlvar.html perlvar], [http://perldoc.perl.org/perlsyn.html perlsyn], [http://perldoc.perl.org/perlop.html perlop], [http://perldoc.perl.org/perlsub.html perlsub] |
Basics [http://perldoc.perl.org/perldata.html perldata], [http://perldoc.perl.org/perlvar.html perlvar], [http://perldoc.perl.org/perlsyn.html perlsyn], [http://perldoc.perl.org/perlop.html perlop], [http://perldoc.perl.org/perlsub.html perlsub] |
||
Line 44: | Line 45: | ||
== Quick Introduction == |
== Quick Introduction == |
||
=== Program Structure === |
=== Program Structure === |
||
Example of a simple Hello World program: |
|||
<source lang="perl"> |
<source lang="perl"> |
||
#!/usr/bin/perl |
#!/usr/bin/perl |
||
use strict; # Stops compilation on unsafe constructs (e.g. undeclared variables) |
use strict; # Stops compilation on unsafe constructs (e.g. undeclared variables) - highly recommended for simplified debugging |
||
use warnings; # Warnings |
use warnings; # Warnings - highly recommended for simplified debugging |
||
print "Hello, World!\n"; |
print "Hello, World!\n"; |
||
Line 55: | Line 56: | ||
exit 0; |
exit 0; |
||
</source> |
</source> |
||
=== Data Types === |
|||
TBC |
|||
{| |
|||
==== Example with modules ==== |
|||
|- |
|||
TBC |
|||
|'''<code>$</code>'''||for scalar values (number, string or reference) |
|||
=== Data === |
|||
|- |
|||
==== Scalar ==== |
|||
|'''<code>@</code>'''||for arrays |
|||
==== Arrays ==== |
|||
|- |
|||
==== Maps ==== |
|||
|'''<code>%</code>'''||for hashes (associative arrays) |
|||
==== References ==== |
|||
|- |
|||
|'''<code>&</code>'''||for subroutines (aka functions, procedures, methods) |
|||
|- |
|||
|'''<code>*</code>'''||for all types of that symbol name. In version 4 you used them like pointers, but in modern perls you can just use references. |
|||
|- |
|||
|'''<code><></code>'''||are used for inputting a record from a filehandle. |
|||
|- |
|||
|'''<code>\</code>'''||takes a reference to something. |
|||
|} |
|||
Note that the last 2 are not really type specifiers. |
|||
=== Arrays === |
|||
Some examples: |
|||
<source lang="perl"> |
<source lang="perl"> |
||
my @array1 = ("titi","tutu"); # (...) is a list used to initialize the array |
|||
$tab{'somekey'} = '...'; |
|||
my @array2 = ("tata","toto"); |
|||
push(@array1,"tete"); # Append an element to an array |
|||
push(@array1,@array2); # Append another array to an array |
|||
</source> |
|||
Arrays can be easily constructed through '''autovivification'''. Below we create a hash of arrays: |
|||
process(\%tab); |
|||
<source lang="perl"> |
|||
my %Projects; # Projects is a hash, but we say nothing on the types of its elements... |
|||
sub process |
|||
foreach my $VOBName (keys %VOBs) |
|||
{ |
{ |
||
my $ProjectName = $VOBs{$VOBName}{'ProjectName'}; |
|||
my $tab = $_[0]; |
|||
push(@{$Projects{$ProjectName}}, $VOBName); # <-- we dereference value returned by $Projects{$ProjectName} as |
|||
$tab->{'somekey'} = '...'; |
|||
} # an array, hence creating automatically an array if undef |
|||
} |
|||
</source> |
</source> |
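A minimal, self-contained sketch of autovivification when building a hash of arrays. The contents of <code>%VOBs</code> are invented for illustration; only the <code>push(@{...}, ...)</code> idiom comes from the example above.

```perl
use strict;
use warnings;

# Hypothetical input: each VOB name maps to a hash of attributes.
my %VOBs = (
    vob_a => { ProjectName => 'alpha' },
    vob_b => { ProjectName => 'beta'  },
    vob_c => { ProjectName => 'alpha' },
);

my %Projects;    # starts empty; the inner arrays are created on demand

foreach my $VOBName (sort keys %VOBs) {
    my $ProjectName = $VOBs{$VOBName}{ProjectName};
    # Dereferencing an undefined hash value as an array creates the array.
    push @{ $Projects{$ProjectName} }, $VOBName;
}

foreach my $ProjectName (sort keys %Projects) {
    print "$ProjectName: @{ $Projects{$ProjectName} }\n";
}
```

The keys are sorted only to make the output order predictable, since hash order is not guaranteed.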
||
Below are some differences in how '''<code>@</code>''' is handled in SCALAR or LIST context: |
|||
==== String ==== |
|||
<source lang="perl"> |
<source lang="perl"> |
||
# RESULT CONTEXT EXPLANATION |
|||
# Concat 2 strings |
|||
my @a = ("titi","tutu"); |
|||
$stringC = $stringA . $stringB; |
|||
my $varnoquote=@a; print "$varnoquote\n"; # "2" (SCALAR - @a is evaluated in scalar context, giving its number of elements) |
|||
$stringC = "$stringA$stringB"; |
|||
my $varquote="@a"; print "$varquote\n"; # "titi tutu" (EXPAND - @a is quote-expanded, each item being separated by space) |
|||
$stringC = join('', ($stringA, $stringB)); |
|||
print @a; print"\n"; # "tititutu" (LIST - $, is empty) |
|||
print(@a); print"\n"; # "tititutu" (LIST - $, is empty) |
|||
printf @a; print"\n"; # "titi" (LIST - 1st element in list is interpreted as the format string) |
|||
printf(@a); print"\n"; # "titi" (LIST - 1st element in list is interpreted as the format string) |
|||
print @a,"\n"; # "tititutu" (LIST - $, is empty) |
|||
printf "%s\n",@a; # "titi" (LIST - only 1st element is read) |
|||
</source> |
</source> |
||
Set variable '''<code>$,</code>''' to modify the list separator used when printing arrays |
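A short sketch of the separator variables, assuming the same <code>@a</code> as above. Using <code>local</code> inside a block is a design choice made here so that the change does not leak into the rest of the program.

```perl
use strict;
use warnings;

my @a = ("titi", "tutu");

{
    local $, = "-";     # separator inserted between the arguments of print
    local $\ = "\n";    # string appended after each print
    print @a;           # prints "titi-tutu"
}

# $" is the separator used when an array is interpolated in a string.
my $joined = do { local $" = "+"; "@a" };
print "$joined\n";      # prints "titi+tutu"
```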
|||
=== If / For / While ... === |
|||
TBC |
|||
=== File and I/O === |
|||
==== -X ==== |
|||
The function '''[http://perldoc.perl.org/functions/-X.html -X]''' can be used for various tests on files and directories, similar to the ''test'' command in ''Bash'': |
|||
<source lang="perl"> |
<source lang="perl"> |
||
my @a = ("titi","tutu"); |
|||
print "The file exists\n" if -e "../somefile"; |
|||
$,="\n"; |
|||
print "The directory exists\n" if -d "../some/directory"; |
|||
print @a; |
|||
</source> |
</source> |
||
=== Hashes === |
|||
Use the special filehandle '''_''' to reuse the result of the previous file test or <code>stat</code> and save a system call, as in: |
|||
Some examples of hashes: |
|||
<source lang="perl"> |
<source lang="perl"> |
||
my %cities = ( #(...) is a hash constructor |
|||
stat($filename); |
|||
"US" => "Washington", |
|||
print "Readable\n" if -r _; |
|||
"GB" => "London" |
|||
print "Writable\n" if -w _; |
|||
); |
|||
print "Executable\n" if -x _; |
|||
print " |
print $cities{"US"},"\n"; |
||
print "Binary\n" if -B _; |
|||
my %hashofhash = ( #This is actually a hash of references to hash |
|||
"address" => {name => "US", city => "Washington" }, |
|||
"identity" => {firstname => "smith", lastname => "Smith" } ); |
|||
print $hashofhash{"address"}{"name"},"\n"; |
|||
print $hashofhash{"address"}->{"name"},"\n"; |
|||
</source> |
</source> |
||
Since 5.9.1, operators can be stacked: |
|||
:<source lang="perl" enclose="prevalid">print "writable and executable\n" if -f -w -x $file; # same as -x $file && -w _ && -f _</source> |
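The following sketch combines the <code>_</code> filehandle and stacked file tests. A temporary file created with the core module <code>File::Temp</code> stands in for a real path; that choice and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for a real path.
my ($fh, $filename) = tempfile(UNLINK => 1);
close $fh;

# One stat() call; "_" reuses its result for the following tests.
stat($filename) or die "Cannot stat $filename: $!";
my $is_file     = -f _ ? 1 : 0;
my $is_readable = -r _ ? 1 : 0;

# Stacked form: true only if every test succeeds.
my $stacked = (-f -r $filename) ? 1 : 0;

print "file=$is_file readable=$is_readable stacked=$stacked\n";
```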
|||
Note that in LIST context, a hash is transformed into an array containing '''both the keys and values''' in the hash! |
|||
==== getcwd / abs_path ==== |
|||
The function '''[http://perldoc.perl.org/Cwd.html getcwd]''' returns the current working directory. '''abs_path''' transforms a given relative path into its equivalent canonical absolute form. |
|||
<source lang="perl"> |
<source lang="perl"> |
||
my %myhash = ( key1 => "value1", key2 => "value2" ); |
|||
use Cwd; |
|||
my @myarray= ( "element1", "element2" ); |
|||
my $dir = getcwd(); |
|||
push (@myarray, %myhash); |
|||
use Cwd 'abs_path'; |
|||
my $abs_path = abs_path($file); |
|||
</source> |
|||
$, = ","; |
|||
== Functions == |
|||
print @myarray; # outputs "element1,element2," followed by the keys and values of %myhash, e.g. "key2,value2,key1,value1" (hash order is not guaranteed) |
|||
See [http://perldoc.perl.org/index-functions.html] for a detailed list of Perl functions. |
|||
== Modules == |
|||
See [http://perldoc.perl.org/index-modules-A.html Core Modules] for a detailed list of Perl modules. Here is a list of frequently used ones: |
|||
TBC |
|||
== One-Liner == |
|||
See [http://sial.org/howto/perl/one-liner/], [http://www.unixguide.net/unix/perl_oneliners.shtml], [http://www.catonmat.net/blog/perl-one-liners-explained-part-one/], [http://defindit.com/readme_files/perl_one_liners.html]. |
|||
<source lang="bash"> |
|||
perl -ne 'print unless /^$/../^$/' input # print lines, unless blank |
|||
perl -ne 'print if ! /^$/../^$/' input # reduce runs of blank lines to a single blank line |
|||
perl -nle 'print $.; close ARGV if eof' input input # $. needs to be reset (by closing ARGV) between 2 input files |
|||
perl -nle 'print for m/\b(\S+)\b/g' paragraphs # print words from file paragraphs |
|||
perl -nle 'while(m/(\S+)\s+the\s+(\S+)/g){print "$1 $2"}' paragraphs # ... while loop needed when using multiple back-references |
|||
perl -lne 'print for /id is <(\d+)>/g' # match pattern and extract backreference |
|||
perl -lne 'print $2 for /id is <(\d+)> or <(\d+)>/g' # ... print 2nd matched backreference |
|||
cat oldfile | perl -pe 's/(\d+)_/sprintf("%2.2d_",$1)/e' > newfile # evaluate regex substitutions |
|||
</source> |
</source> |
||
== |
=== References === |
||
=== Miscellaneous examples === |
|||
<source lang="perl"> |
<source lang="perl"> |
||
my $VOBAttrRef = $VOBs{'AdminMask'}; # This returns a reference to a Hash |
|||
# Various examples in Perl |
|||
my %VOBAttr = %$VOBAttrRef; # This dereferences the reference above and returns a (copy of the) Hash |
|||
die "can't run this"; |
|||
print $VOBAttr{'ProjectName'},"\n"; # We can use our new Hash variable |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
print $$VOBAttrRef{'ProjectName'},"\n"; # ... or we can dereference our reference variable using the $$ construct |
|||
#Split a multi-line variable/output in line components - method 1. |
|||
print $VOBAttrRef->{'ProjectName'},"\n"; # ... but -> can also be used to dereference |
|||
my @ArrayList = `$CT lsvob -short`; #any command producing a multi-line output |
|||
print $VOBs{'AdminMask'}->{'ProjectName'},"\n"; # We can also skip the reference variable altogether |
|||
print $VOBs{'AdminMask'}{'ProjectName'},"\n"; # ... This notation is also available as a shortcut, -> can be omitted |
|||
</source> |
|||
Passing references to sub-routines: |
|||
foreach (@ArrayList) |
|||
<source lang="perl"> |
|||
$tab{'somekey'} = '...'; |
|||
process(\%tab); # pass a reference to the hash %tab |
|||
sub process # no empty prototype, so that the sub accepts an argument |
|||
{ |
{ |
||
my $tab = $_[0]; |
|||
chomp; #remove the trailing newline (chop would remove the last character, whatever it is) |
|||
$tab->{'somekey'} = '...'; |
|||
print "Array List - The VOB is $_.\n"; |
|||
} |
} |
||
</source> |
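A minimal runnable sketch of passing a hash by reference to a sub-routine. The names <code>%tab</code>, <code>$tab_ref</code> and <code>process</code> are illustrative.

```perl
use strict;
use warnings;

my %tab = (somekey => 'old value');

# The sub receives the reference as its first argument and
# modifies the caller's hash through it.
sub process {
    my ($tab_ref) = @_;
    $tab_ref->{somekey} = 'new value';
    return;
}

process(\%tab);             # \%tab is a reference to the hash
print "$tab{somekey}\n";    # prints "new value"
```

Because only a reference is copied, the change made inside the sub is visible to the caller.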
|||
Using Anonymous Hash References: |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
<source lang="perl"> |
|||
#Split a multi-line variable/output in line components - method 2. |
|||
#!/usr/bin/perl |
|||
my $ScalarList = `$CT lsvob -short`; #any command producing a multi-line output |
|||
use strict; |
|||
my @ArrayList2 = split/\n/,$ScalarList; #split the scalar into several lines |
|||
my @myarray; |
|||
foreach ( |
foreach my $iter ( 1..10 ) |
||
{ |
{ |
||
my $value1 = "value1_".$iter; |
|||
print "Scalar List - The VOB is $_.\n"; |
|||
my $value2 = "value2_".$iter; |
|||
}; |
|||
print "Creating our \$hashref... "; |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
my $hashref = { index1 => $value1, index2 => $value2 }; # { key1 => value1, ... } creates a REFERENCE to an anonymous hash. |
|||
# Use ${variable} to split scalar identifier from the rest of a text |
|||
# Since reference are SCALAR, we assign it to a scalar variable |
|||
my $variable; |
|||
print "Done.\n", |
|||
print "$variable_temp\n"; # NOK! Print a variable named variable_temp |
|||
" \$hashref: ",$hashref,"\n"; |
|||
print "${variable}_temp\n"; # OK! Print a $variable, followed by "_temp" |
|||
print " content: ",$$hashref{'index1'},",",$$hashref{'index2'},"\n"; |
|||
print "Adding \$hashref to our array... "; |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
push( @myarray, $hashref ); |
|||
#Append an array (or single element) to another |
|||
push(@array1,@array_or_element); |
|||
print "Done. There are currently ", scalar(@myarray), " elements in \@myarray.\n"; |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
print "Accessing last element of our array..."; |
|||
#Add use strict; at the beginning to improve compilation warnings. |
|||
print " content: $myarray[-1], ${$myarray[-1]}{'index1'} or better yet $myarray[-1]->{'index2'}\n"; |
|||
use strict; |
|||
} |
|||
print "\n\nNow we will traverse our array again...\n"; |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
foreach ( @myarray ) |
|||
#Arrays / Hashes |
|||
{ |
|||
# |
|||
print "$_ containing ", |
|||
"index1 => $$_{'index1'},", |
|||
"index2 => $$_{'index2'}\n"; |
|||
print "... or using -> operator: ", |
|||
"index1 => $_->{'index1'},", |
|||
"index2 => $_->{'index2'}\n"; |
|||
} |
|||
</source> |
|||
=== String === |
|||
#The following actually adds eight elements (four keys and four values) to array VOBRecords. It doesn't create a single extra element containing the hash. |
|||
<source lang="perl"> |
|||
my %VOBAttrib = ( VOBName => $VOBName, IsProjectVOB => $IsProjectVOB, IsProjectAdminVOB => $IsProjectAdminVOB, ProjectName => $ProjectName ); |
|||
# Concat 2 strings |
|||
push (@VOBRecords, %VOBAttrib); |
|||
$stringC = $stringA . ucfirst($stringB); |
|||
$stringC = "$stringA$stringB"; |
|||
$stringC = join('', ($stringA, ucfirst($stringB))); |
|||
</source> |
|||
=== If / For / While ... === |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
TBC |
|||
# Handling reference |
|||
my $VOBAttrRef = $VOBs{'AdminMask'}; #This returns a reference to a Hash |
|||
my %VOBAttr = %$VOBAttrRef; #This dereferences the reference above and returns a (copy of the) Hash |
|||
=== Operators === |
|||
print $VOBAttr{'ProjectName'},"\n"; #We can use our new Hash variable |
|||
==== Quote and quote-like operators ==== |
|||
print $$VOBAttrRef{'ProjectName'},"\n"; # ... or we can dereference our reference variable using the $$ construct |
|||
See [http://perldoc.perl.org/perlop.html#Quote-and-Quote-like-Operators perldoc] for detailed information. |
|||
print $VOBAttrRef->{'ProjectName'},"\n"; # ... but -> can also be used to dereference |
|||
print $VOBs{'AdminMask'}->{'ProjectName'},"\n"; #We can also skip the reference variable altogether |
|||
print $VOBs{'AdminMask'}{'ProjectName'},"\n"; # ... This notation is also available as a shortcut, -> can be omitted |
|||
{| class="wikitable" |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
|- |
|||
#Autovivification - example on how to create a hash of array |
|||
!Customary!!Generic!!Meaning!!Interpolates |
|||
# |
|||
|- |
|||
|'''<code>''</code>'''||'''<code>q{}</code>'''||Literal||no |
|||
|- |
|||
|'''<code>""</code>'''||'''<code>qq{}</code>'''||Literal||yes |
|||
|- |
|||
|'''<code>``</code>'''||'''<code>qx{}</code>'''||Command||yes(*) |
|||
|- |
|||
|'''<code></code>'''||'''<code>qw{}</code>'''||Word list||no |
|||
|- |
|||
|'''<code>//</code>'''||'''<code>m{}</code>'''||Pattern match||yes(*) |
|||
|- |
|||
|'''<code></code>'''||'''<code>qr{}</code>'''||Pattern||yes(*) |
|||
|- |
|||
|'''<code></code>'''||'''<code>s{}{}</code>'''||Substitution||yes(*) |
|||
|- |
|||
|'''<code></code>'''||'''<code>tr{}{}</code>'''||Transliteration||no (but see below) |
|||
|- |
|||
|'''<code></code>'''||'''<code><<EOF</code>'''||here-doc||yes(*) |
|||
|} |
|||
::<small>(*) unless the delimiter is '''<code>''</code>'''.</small> |
|||
''Interpolates'' means that variables like '''<code>$VAR</code>''' are expanded, and that ''escape sequences'' like '''<code>\n</code>''' are processed.<br/> |
|||
Also other delimiters can be used. For instance: |
|||
<source lang="perl"> |
|||
#Use any brackets |
|||
print q{Hello World}; |
|||
print q(Hello World); |
|||
print q[Hello World]; |
|||
print q<Hello World>; |
|||
#Brackets delimiters nest correctly, like |
|||
print q{Hello {my} World}; # Equivalent to 'Hello {my} World' |
|||
#We can use any non-whitespace character |
|||
print q!Hello World!; |
|||
print q|Hello World|; |
|||
print q#Hello World#; |
|||
</source> |
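The ''Interpolates'' column of the table can be checked with a small sketch such as the one below. The variable names are illustrative.

```perl
use strict;
use warnings;

my $name = "World";

my $literal      = q{Hello $name\n};      # no interpolation: kept exactly as typed
my $interpolated = qq{Hello $name\n};     # $name is expanded, \n becomes a newline
my @words        = qw{alpha beta gamma};  # list of three words, no quotes or commas

print length($literal), "\n";    # prints 13
print $interpolated;             # prints "Hello World" and a newline
print scalar(@words), "\n";      # prints 3
```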
|||
Beware of some caveats: |
|||
my %Projects; |
|||
<source lang="perl"> |
|||
foreach my $VOBName (keys %VOBs) |
|||
$s = q{ if($a eq "}") ... }; # WRONG - } inside "}" is not nested, so quoting will stop there |
|||
{ |
|||
$s = q #Hello World# # WRONG - Because of the whitespace, #Hello World# is taken as a comment |
|||
my $ProjectName = $VOBs{$VOBName}{'ProjectName'}; |
|||
</source> |
|||
push(@{$Projects{$ProjectName}}, $VOBName); # <-- we dereference value returned by $Projects{$ProjectName} as |
|||
# an array, hence creating automatically an array if undef |
|||
} |
|||
=== Regular expressions === |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
Use '''<code>/regex/</code>''' or '''<code>m!regex!</code>''' (where <code>!</code> can be any quoting character).<br/> |
|||
#Read something from standard input |
|||
Use '''<code>=~</code>''' to match a given variable, otherwise '''<code>$_</code>''' is used. Use '''<code>!~</code>''' to reverse the sense of the match. |
|||
$line = <STDIN>; |
|||
$line = readline(*STDIN); # same thing |
|||
chomp($line = <STDIN>); # remove trailing newline |
|||
==== Finding matches ==== |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
In '''SCALAR context''', '''<code>/regex/</code>''' returns true if a match is found and false otherwise: |
|||
#Read one character |
|||
<source lang="perl"> |
|||
#print "Press RETURN..."; |
|||
#$key = getc(); |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
# Regex Matching |
|||
# Use /regex/, or m!regex! (where ! can be any quoting character) |
|||
# Use =~ to match a given variable, otherwise $_ is used. Use !~ to reverse the sense of the match. |
|||
# SCALAR CONTEXT: // return true/false if matching found |
|||
$myvar =~ /World/ #scalar context, returns true if $myvar contains World |
$myvar =~ /World/ #scalar context, returns true if $myvar contains World |
||
/World/ #scalar context, same as above except that now it is $_ that is matched |
/World/ #scalar context, same as above except that now it is $_ that is matched |
||
"Hello World" =~ /World/ #scalar context, same as above, to show that left member doesn't need to be an L-Value |
"Hello World" =~ /World/ #scalar context, same as above, to show that left member doesn't need to be an L-Value |
||
</source> |
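A runnable sketch of the three forms above used as conditions. The test strings are invented for the example.

```perl
use strict;
use warnings;

my $myvar = "Hello World";

my $has_world = ($myvar =~ /World/) ? "yes" : "no";   # the match succeeds
my $no_perl   = ($myvar !~ /Perl/)  ? "yes" : "no";   # !~ is true when there is no match

my $count = 0;
for ("World Cup", "Hello", "New World") {
    $count++ if /World/;    # no =~ : the match is applied to $_
}

print "$has_world $no_perl $count\n";   # prints "yes yes 2"
```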
|||
==== Extracting matches ==== |
|||
The grouping metacharacters '''<code>()</code>''' also allow the extraction of the parts of a string that matched. For each grouping, the part that matched inside goes into the special variables '''<code>$1</code>''' , '''<code>$2</code>'''... They can be used just as ordinary variables: |
|||
<source lang="perl"> |
|||
# matched inside goes into the special variables $1 , $2 , etc. They can be used just as ordinary variables: |
|||
# extract hours, minutes, seconds |
# extract hours, minutes, seconds |
||
$time =~ /(\d\d):(\d\d):(\d\d)/; # match hh:mm:ss format |
$time =~ /(\d\d):(\d\d):(\d\d)/; # match hh:mm:ss format |
||
Line 239: | Line 283: | ||
$minutes = $2; |
$minutes = $2; |
||
$seconds = $3; |
$seconds = $3; |
||
</source> |
|||
In '''LIST context''', '''<code>/regex/</code>''' with groupings will return the list of matched values ($1,$2,...) . So we could rewrite the above as: |
|||
<source lang="perl"> |
|||
($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/); |
($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/); |
||
</source> |
|||
If the groupings in a regex are nested, '''<code>$1</code>''' gets the group with the leftmost opening parenthesis, '''<code>$2</code>''' the next opening parenthesis... For example, here is a complex regex and the matching variables indicated below it: |
|||
# etc. For example, here is a complex regex and the matching variables indicated below it: |
|||
# /(ab(cd|ef)((gi)|j))/; |
|||
# 1 2 34 |
|||
/(ab(cd|ef)((gi)|j))/; |
|||
#Associated with the matching variables $1 , $2 , ... are the backreferences \1 , \2 , ... Backreferences are matching variables |
|||
1 2 34 |
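The numbering of nested groups can be verified with a short sketch. The input string <code>abcdgi</code> is chosen here so that all four groups take part in the match.

```perl
use strict;
use warnings;

# In list context the match returns ($1, $2, $3, $4).
my @groups = ("abcdgi" =~ /(ab(cd|ef)((gi)|j))/);

# $1 = 'abcdgi' (whole group), $2 = 'cd', $3 = 'gi', $4 = 'gi'
print join(",", @groups), "\n";    # prints "abcdgi,cd,gi,gi"
```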
|||
==== Using back-references ==== |
|||
Associated with the matching variables '''<code>$1</code>''', '''<code>$2</code>'''... are the backreferences '''<code>\1</code>''', '''<code>\2</code>'''... Backreferences are matching variables that can be used inside a regex: |
|||
<source lang="perl"> |
|||
/(\w\w\w)\s\1/; # find sequences like 'the the' in string |
/(\w\w\w)\s\1/; # find sequences like 'the the' in string |
||
</source> |
|||
Note that '''<code>$1</code>''', '''<code>$2</code>'''.... should only be used outside of a regex, and '''<code>\1</code>''', '''<code>\2</code>'''... only inside a regex. |
|||
==== Search & Replace ==== |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
# Regex Search & Replace |
|||
Use '''<code>s/regex/replacement/modifiers</code>'''. Use '''<code>=~</code>''' to match a given variable, otherwise '''<code>$_</code>''' is used. |
|||
# Use =~ to match a given variable, otherwise $_ is used. |
|||
# SCALAR CONTEXT: s/// returns the number of matches, or false if no match. |
|||
In '''SCALAR''' context, '''<code>s///</code>''' returns the number of matches, or false if no match. |
|||
<source lang="perl"> |
|||
$x = "Time to feed the cat!"; |
$x = "Time to feed the cat!"; |
||
$x =~ s/cat/hacker/; # $x contains "Time to feed the hacker!" |
$x =~ s/cat/hacker/; # $x contains "Time to feed the hacker!" |
||
</source> |
|||
Note that the matching variables '''<code>$1</code>''', '''<code>$2</code>'''... can be used in the replacement string. |
|||
# VARIABLES: |
|||
# $1,$2: matched variables are immediately available in the replacement string. |
|||
# MODIFIERS: |
|||
# - g: find all matches |
|||
# - e: wraps an eval{...} around the replacement string and the evaluated result is substituted for the matched substring. |
|||
Some modifiers: |
|||
* '''<code>g</code>''' - Find all matches |
|||
* '''<code>e</code>''' - wraps an '''<code>eval{...}</code>''' around the replacement string; the evaluated result is substituted for the matched substring. Example: |
|||
{{pl2|<source lang="perl"> |
|||
# reverse all the words in a string |
# reverse all the words in a string |
||
$x = "the cat in the hat"; |
$x = "the cat in the hat"; |
||
$x =~ s/(\w+)/reverse $1/ge; # $x contains "eht tac ni eht tah" |
$x =~ s/(\w+)/reverse $1/ge; # $x contains "eht tac ni eht tah" |
||
</source>}} |
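The return value of '''<code>s///</code>''' and the use of capture variables in the replacement can be sketched as follows. The sample strings are illustrative.

```perl
use strict;
use warnings;

my $x = "the cat in the hat";
my $n = ($x =~ s/the/a/g);         # number of substitutions made
print "$n: $x\n";                  # prints "2: a cat in a hat"

my $name = "Smith, John";
$name =~ s/(\w+), (\w+)/$2 $1/;    # $1 and $2 are available in the replacement
print "$name\n";                   # prints "John Smith"
```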
|||
==== The split operator ==== |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
'''<code>split /regex/, string</code>''' splits string into a list of substrings and returns that list. The regex determines the character sequence that string is split with respect to. For example, to extract a comma-delimited list of numbers, use |
|||
# The split operator |
|||
<source lang="perl"> |
|||
# split /regex/, string splits string into a list of substrings and returns that list. The regex determines the character sequence |
|||
# that string is split with respect to. For example, to extract a comma-delimited list of numbers, use |
|||
$x = "1.618,2.718, 3.142"; |
$x = "1.618,2.718, 3.142"; |
||
@const = split /,\s*/, $x; # $const[0] = '1.618' |
@const = split /,\s*/, $x; # $const[0] = '1.618', $const[1] = '2.718', $const[2] = '3.142' |
||
</source> |
|||
# $const[1] = '2.718' |
|||
# $const[2] = '3.142' |
|||
# If the empty regex // is used, the string is split into individual characters. If the regex has groupings, then the list produced |
|||
# contains the matched substrings from the groupings as well: |
|||
If the empty regex '''<code>//</code>''' is used, the string is split into individual characters. If the regex has groupings, then the list produced contains the matched substrings from the groupings as well: |
|||
<source lang="perl"> |
|||
$x = "/usr/bin"; |
$x = "/usr/bin"; |
||
@parts = split m!(/)!, $x; # $parts[0] = '' Since the first character of $x matched the regex, an initial element was prepended. |
@parts = split m!(/)!, $x; # $parts[0] = '' Since the first character of $x matched the regex, an initial element was prepended. |
||
# $parts[1] = '/' |
# $parts[1] = '/' The delimiter is also in the list because of the grouping (/) |
||
# $parts[2] = 'usr' |
# $parts[2] = 'usr' |
||
# $parts[3] = '/' |
# $parts[3] = '/' Yet a delimiter because of the grouping |
||
# $parts[4] = 'bin' |
# $parts[4] = 'bin' |
||
</source> |
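The three behaviours of <code>split</code> described above (empty regex, plain delimiter, delimiter with grouping) can be compared side by side in a sketch like this one:

```perl
use strict;
use warnings;

my @chars  = split //, "abc";                      # ('a', 'b', 'c')
my @fields = split /,\s*/, "1.618,2.718, 3.142";   # three numbers
my @parts  = split m!(/)!, "/usr/bin";             # delimiters kept because of the grouping

print scalar(@chars), " ", scalar(@fields), " ", scalar(@parts), "\n";   # prints "3 3 5"
print join("|", @parts), "\n";                     # prints "|/|usr|/|bin"
```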
|||
==== Lookahead / Lookbehind ==== |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
The lookahead and lookbehind assertions are generalizations of the anchor concept. Lookahead and lookbehind are zero-width assertions that let us specify which characters we want to test for. The '''lookahead''' assertion is denoted by '''<code>(?=regexp)</code>''' and the '''lookbehind''' assertion is denoted by '''<code>(?<=fixed-regexp)</code>'''. Some examples are |
|||
# assertions that let us specify which characters we want to test for. The lookahead assertion is denoted by (?=regexp) and the |
|||
# lookbehind assertion is denoted by (?<=fixed-regexp). Some examples are |
|||
<source lang="perl"> |
|||
$x = "I catch the housecat 'Tom-cat' with catnip"; |
|||
$x = "I catch the housecat 'Tom-cat' with catnip"; |
|||
$x =~ /cat(?=\s+)/; # matches 'cat' in 'housecat' |
|||
@catwords = ($x =~ /(?<=\s)cat\w+/g); # matches, $catwords[0] = 'catch' $catwords[1] = 'catnip' |
|||
$x =~ /\bcat\b/; # matches 'cat' in 'Tom-cat' |
|||
$x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in middle of $x |
|||
</source> |
|||
$x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in |
|||
# middle of $x |
|||
==== Grep / Map ==== |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
Use '''<code>grep</code>''' on a list to return the elements of that list for which the expression is true. For instance: |
|||
# Grep |
|||
<source lang="perl"> |
|||
@foo = grep(!/^#/, @bar); # Only returns lines that are not comments |
|||
my @array = ("el1","gel2","el3","gel1","gel2"); |
my @array = ("el1","gel2","el3","gel1","gel2"); |
||
my @array2 = grep {s/(.*el)/reverse $1/e} @array; |
my @array2 = grep {s/(.*el)/reverse $1/e} @array; # grep may also modify the elements; since $_ is an alias, this changes @array itself too |
||
</source> |
|||
Use '''<code>map</code>''' on a list to apply a given expression on all elements in the list. |
|||
$,="\n"; |
|||
print @array2; |
|||
<source lang="perl"> |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
@chars = map(chr, @nums); # Returns the list of characters corresponding to the list of numbers |
|||
# How to discard stderr on windows |
|||
</source> |
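A self-contained sketch of <code>grep</code> and <code>map</code>, with invented sample data. The last statement copies <code>$_</code> before modifying it, which is one way to leave the input array untouched.

```perl
use strict;
use warnings;

my @bar = ("# comment", "line 1", "line 2");
my @foo = grep { !/^#/ } @bar;          # keep lines that are not comments

my @nums  = (72, 105);
my @chars = map { chr($_) } @nums;      # ('H', 'i')

# grep and map alias $_ to each element, so s/// on $_ would change the
# original array; copy first when the input must stay untouched.
my @array = ("el1", "gel2");
my @upper = map { my $copy = $_; $copy =~ s/el/EL/; $copy } @array;

print "@foo | @chars | @upper | @array\n";
# prints "line 1 line 2 | H i | EL1 gEL2 | el1 gel2"
```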
|||
# note: on windows, we use \nul instead of nul because each folder has its own nul handler, and we want to reduce the number of |
|||
# used handles |
|||
=== File and I/O === |
|||
{| |
|||
|- |
|||
|'''chdir''' (function)||Change the current working directory |
|||
|- |
|||
|'''-X''' (function)||Various tests on files, directories... pretty much like in ''Bash'' scripts. |
|||
|- |
|||
|'''getcwd''' (module ''Cwd'')||Get the current working directory |
|||
|- |
|||
|'''abs_path'''||Transform a relative path into absolute path |
|||
|} |
|||
{| class="wikitable" |
|||
|- |
|||
|Read something from '''standard input''' |
|||
|<source lang="perl"> |
|||
$line = <STDIN>; |
|||
$line = readline(*STDIN); # same thing |
|||
chomp($line = <STDIN>); # remove trailing newline |
|||
</source> |
|||
|- |
|||
|Read one character from STDIN |
|||
|<source lang="perl"> |
|||
print "Press RETURN..."; |
|||
$key = getc(); |
|||
</source> |
|||
|- |
|||
|System calls |
|||
|<source lang="perl"> |
|||
system "echo hello world!"; |
|||
system qq(echo hello world!); |
|||
system $MYCMD, qw(param1), 'the name is'.getname($index); |
|||
</source> |
|||
|- |
|||
|'''Discard STDERR''' on Windows / Linux. Note that on Windows, we use <tt>\nul</tt> because each folder has its own nul handler and we want to reduce the number of handles used |
|||
|<source lang="perl"> |
|||
my $STDERRNULL = "2>\\nul"; #use this on windows |
my $STDERRNULL = "2>\\nul"; #use this on windows |
||
my $STDERRNULL = "2>/dev/null"; #use this on unix |
my $STDERRNULL = "2>/dev/null"; #use this on unix |
||
my @CTResults = qx(ls somedirectory $STDERRNULL); |
|||
my $AT="@"; |
|||
</source> |
|||
|- |
|||
my @CTResults = qx($CT lstype -local -fmt "%n \\\"%[type_scope]p\\\"\\n" lbtype:$labelName$AT$vobName $STDERRNULL); |
|||
|'''Capture STDOUT''' |
|||
|<source lang="perl"> |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
# here's a file-private function as a closure, |
|||
# callable as &$priv_func; it cannot be prototyped. |
|||
my $priv_func = sub { |
|||
# stuff goes here. |
|||
}; |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
# Capture command output: use back-ticks ``, qx(), or system("") with redirection. |
|||
my @output = `ls`; |
my @output = `ls`; |
||
my @output = qx(ls); |
my @output = qx(ls); |
||
system("ls >output.txt"); |
system("ls >output.txt"); |
||
</source> |
|||
|- |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
| Capture command exit status |
|||
|<source lang="perl"> |
|||
my $exit_status = system("del file.txt"); |
my $exit_status = system("del file.txt"); |
||
</source> |
|||
|- |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
| Temporarily disable STDERR and restore it afterwards |
|||
#Handling of @ in scalar / list context. |
|||
|<source lang="perl"> |
|||
# RESULT CONTEXT EXPLANATION |
|||
my @a = ("titi","tutu"); |
|||
my $varnoquote=@a; print "$varnoquote\n"; # "2" (SCALAR - @a is evaluated in scalar context, giving its number of elements) |
|||
my $varquote="@a"; print "$varquote\n"; # "titi tutu" (EXPAND - @a is quote-expanded, each item being separated by space) |
|||
print @a; print"\n"; # "tititutu" (LIST - $, is empty) |
|||
print(@a); print"\n"; # "tititutu" (LIST - $, is empty) |
|||
printf @a; print"\n"; # "titi" (LIST - 1st element in list is interpreted as the format string) |
|||
printf(@a); print"\n"; # "titi" (LIST - 1st element in list is interpreted as the format string) |
|||
print @a,"\n"; # "tititutu" (LIST - $, is empty) |
|||
printf "%s\n",@a; # "titi" (LIST - only 1st element is read) |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
#Give default value if no parameter in sub |
|||
sub myfunc |
|||
{ |
|||
my($suffix) = @_ ? "@_" : "defaultvalue"; |
|||
} |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
#Report a complete error message when loading a script |
|||
#This hack allows for printing a custom error message + file not found-like error message (given by $!) + syntax error messages (@_) |
|||
do "your script.pl" |
|||
or (print "Your error message\n$!\n" and die @_); |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
#Example on how to embed perl into a w2K shell script. |
|||
#Notice how the first rem is actually a multiline assignment to perl array variable @rem, where the value is quoted with ' '. |
|||
@rem= 'PERL for Windows NT - ccperl must be in search path |
|||
@echo off |
|||
ccperl %0 %1 %2 %3 %4 %5 %6 %7 %8 %9 |
|||
goto endofperl |
|||
@rem '; |
|||
# Your Perl code comes here |
|||
# End of Perl section |
|||
__END__ |
|||
:endofperl |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
#Example on use of qw (=''), join, etc to build complex line using variable and function calls (here vobName() is a function) |
|||
system "echo", $CT, qw(rmtype), join("",'trtype:MKELEM_POST_OWNER@',vobName($vobname)); |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
#Example on how to temporarily disable STDERR and restore it afterwards |
|||
open(SAVE_STDERR, '>&STDERR'); |
open(SAVE_STDERR, '>&STDERR'); |
||
close(STDERR) unless $ENV{CLEARCASE_TRACE_TRIGGERS}; |
close(STDERR) unless $ENV{CLEARCASE_TRACE_TRIGGERS}; |
||
Line 401: | Line 427: | ||
open(STDERR, '>&SAVE_STDERR'); |
open(STDERR, '>&SAVE_STDERR'); |
||
close(SAVE_STDERR); |
close(SAVE_STDERR); |
||
</source> |
|||
|} |
|||
=== Sub-routines === |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
Declaration and definition syntax: |
|||
#REDIRECTION OF STDERR WITH system() |
|||
<source lang="perl"> |
|||
# |
|||
sub NAME[(PROTO)] [: ATTRS]; # A "forward" declaration |
|||
#It seems that STDERR can only be redirected if it occurs in the command, not in args! |
|||
sub NAME[(PROTO)] [: ATTRS] BLOCK # A declaration and definition |
|||
system "echo hello world! 2>\\nul"; # OK |
|||
$subref = sub (PROTO) : ATTRS BLOCK; # An anonymous sub-routine, called with &$subref |
|||
system qq(echo hello world! 2>\\nul); # OK |
|||
</source> |
|||
system "echo", "hello world!"," 2>\\nul"; # OK |
|||
Importing a sub-routine: |
|||
system "$CT hello world! 2>\\nul"; # OK |
|||
<source lang="perl"> |
|||
system qq(CT hello world! 2>\\nul); # OK |
|||
use MODULE qw(NAME1 NAME2 NAME3); |
|||
system "$CT", "hello world!"," 2>\\nul"; # NOK (seems that external pgm can not be redirected like that) |
|||
</source> |
|||
Calling a sub-routine: |
|||
<source lang="perl"> |
|||
NAME(LIST); # & is optional with parentheses. |
|||
NAME LIST; # Parentheses optional if predeclared/imported. |
|||
&NAME(LIST); # Circumvent prototypes. |
|||
&NAME; # Makes current @_ visible to called subroutine. |
|||
</source> |
|||
Examples: |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
<source lang="perl"> |
|||
#Searching or modifying arrays: map / grep |
|||
sub mySub1 |
|||
@chars = map(chr, @nums); |
|||
{ |
|||
@foo = grep(!/^#/, @bar); # weed out comments |
|||
my ($param1, $param2) = @_; |
|||
return $param1.$param2; |
|||
} |
|||
sub mySub2 |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
{ |
|||
#Handling of various quoting |
|||
my $param1 = shift; |
|||
# |
|||
my $param2 = shift; |
|||
# qx `` |
|||
return $param1.$param2; |
|||
# => $VAR is expanded |
|||
} |
|||
# => "..." quote are conversed |
|||
</source> |
|||
# => \ is processed (so use \\ for backslash in a windows path for instance) |
|||
qx($CTNDEBUG lstype \n -fmt "%n\\n" -kind brtype -invob $VOBAdminName); |
|||
Using default value for sub-routine parameters: |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
sub myfunc |
|||
#Choping character: very useful to remove the trailing "\n", also on list! |
|||
{ |
|||
my($suffix) = @_ ? "@_" : "defaultvalue"; |
|||
} |
|||
== Functions == |
|||
See [http://perldoc.perl.org/index-functions.html] for a detailed list of Perl functions. |
|||
=== Chop / Chomp === |
|||
'''<code>chop</code>''' removes the last character of a string. It also works on lists. |
|||
<source lang="perl"> |
|||
chop( my $userinput=<STDIN> ); #Chop the trailing "\n" in user input |
chop( my $userinput=<STDIN> ); #Chop the trailing "\n" in user input |
||
chop( my @list=qx(ls) ); #Chop the trailing "\n" in the command output |
chop( my @list=qx(ls) ); #Chop the trailing "\n" in the command output |
||
</source> |
|||
'''<code>chomp</code>''' removes the trailing record separator (typically '''<code>\n</code>''') of a string. It also works on lists. |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
<source lang="perl"> |
|||
#CHOMPing character: safer version (in case for instance the last line doesn't have the \n character |
|||
chomp( my $userinput=<STDIN> ); #Chomp the trailing "\n" in user input IF PRESENT |
chomp( my $userinput=<STDIN> ); #Chomp the trailing "\n" in user input IF PRESENT |
||
chomp( my @list=qx(ls) ); #Chomp the trailing "\n" in the command output IF PRESENT |
chomp( my @list=qx(ls) ); #Chomp the trailing "\n" in the command output IF PRESENT |
||
</source> |
|||
=== -X === |
|||
The function '''[http://perldoc.perl.org/functions/-X.html -X]''' can be used for various tests on files and directories, similar to the ''test'' command in ''Bash'': |
|||
"AdminMask" => { ProjectName => "\"\"" }, |
|||
<source lang="perl"> |
|||
"MaskADKSAM" => { ProjectName => "\"MaskADKSAM\""} |
|||
print "The file exists\n" if -e "../somefile"; |
|||
); |
|||
print "The directory exists\n" if -d "../some/directory"; |
|||
</source> |
|||
Use '''_''' to save a system call, like in: |
|||
#----------------------------------------------------------------------------------------------------------------------------- |
|||
<source lang="perl"> |
|||
#passing filehandle as sub parameters and return values --> use reference |
|||
stat($filename); |
|||
# |
|||
print "Readable\n" if -r _; |
|||
# First as return values: |
|||
print "Writable\n" if -w _; |
|||
print "Executable\n" if -x _; |
|||
print "Text\n" if -T _; |
|||
print "Binary\n" if -B _; |
|||
</source> |
|||
Since 5.9.1, operators can be stacked: |
|||
:<source lang="perl" enclose="prevalid">print "writable and executable\n" if -f -w -x $file; # same as -x $file && -w _ && -f _</source> |
|||
== Modules == |
|||
sub openTimeOut($) |
|||
See [http://perldoc.perl.org/index-modules-A.html Core Modules] for a detailed list of Perl modules. Here is a list of frequently used ones: |
|||
{ |
|||
my $filename = shift; |
|||
my $timeout=15; |
|||
while( !open(LOG,$filename) ) { sleep 1; --$timeout or die "Time out trying to open file $filename"; } |
|||
return \*LOG; |
|||
} |
|||
=== (CWD) getcwd / abs_path === |
|||
sub printToFile($@) |
|||
The function '''[http://perldoc.perl.org/Cwd.html getcwd]''' returns the current working directory. '''abs_path''' transforms a given relative path into its equivalent canonical absolute form. |
|||
{ |
|||
<source lang="perl"> |
|||
my $filename = shift; |
|||
use Cwd qw(getcwd abs_path); |
|||
my $fh = openTimeOut(">$filename"); |
|||
my $dir = getcwd(); |
|||
print $fh @_; |
|||
my $abs_path = abs_path($file); |
|||
close($fh); |
|||
</source> |
|||
} |
|||
=== File::Find === |
|||
# |
|||
'''<code>File::Find</code>''' provides functions similar to the Unix find command for searching through directory trees and doing work on each file. |
|||
# BUT BEWARE, ACTUALLY OpenTimeOut returns a reference to the same FILEHANDLE in current glob ! |
|||
<source lang="perl"> |
|||
# The code below illustrate this: |
|||
use File::Find; |
|||
# |
|||
find(\&wanted, @directories_to_search); #depth-first search - preorder traversal - no options |
|||
# |
|||
sub wanted { ... } |
|||
my ($to,$from) = @_; |
|||
$fhto = openTimeOut(\*TO,">>$to"); |
|||
$fhfrom = openTimeOut(\*FROM,"<$from"); # This returns same FILEHANDLE reference as $fhto |
|||
while (<$fhfrom>) {print $fhto $_} # Failed, because now $fhto = $fhfrom, which only open for output |
|||
close($fhfrom); |
|||
close($fhto); |
|||
use File::Find; |
|||
# |
|||
find({ wanted => \&process, follow => 1 }, '.'); #With options |
|||
# The solution, pass by parameters: |
|||
sub process { ... } |
|||
# |
|||
sub openTimeOut2(*;$) |
|||
{ |
|||
my $fh = shift; |
|||
my $filename = shift; |
|||
my $timeout=15; |
|||
while( !open($fh,$filename) ) { sleep 1; --$timeout or die "Time out trying to open file $filename"; } |
|||
} |
|||
use File::Find; |
|||
sub printToFile($@) |
|||
finddepth(\&wanted, @directories_to_search); #depth-first search - post-order traversal - no options |
|||
{ |
|||
sub wanted { ... } |
|||
my $filename = shift; |
|||
</source> |
|||
openTimeOut2(\*LOG,">$filename"); |
|||
print LOG @_; |
|||
== Examples == |
|||
close(LOG); |
|||
} |
|||
=== One-Liners === |
|||
#----------------------------------------------------------------------------------------------------------------------------- |
|||
See [http://sial.org/howto/perl/one-liner/], [http://www.unixguide.net/unix/perl_oneliners.shtml], [http://www.catonmat.net/blog/perl-one-liners-explained-part-one/], [http://defindit.com/readme_files/perl_one_liners.html]. |
|||
#uppercase / lowercase |
|||
# |
|||
<source lang="bash"> |
|||
perl -ne 'print unless /^$/../^$/' input # print lines, unless blank |
|||
perl -ne 'print if ! /^$/../^$/' input # reduce runs of blank lines to a single blank line |
|||
perl -nle 'print $.; close ARGV if eof' input input # $. need to be reset (by closing ARGV) between 2 input files |
|||
perl -nle 'print for m/\b(\S+)\b/g' paragraphs # print words from file paragraphs |
|||
perl -nle 'while(m/(\S+)\s+the\s+(\S+)/g){print "$1 $2"}' paragraphs # ... while loop needed when using multiple back-references |
|||
perl -lne 'print for /id is <(\d+)>/g' # match pattern and extract backreference |
|||
perl -lne 'print $2 for /id is <(\d+)> or <(\d+)>/g' # ... print 2nd matched backreference |
|||
cat oldfile | perl -pe 's/(\d+)_/sprintf("%2.2d_",$1)/e' > newfile # evaluate regex substitutions |
|||
</source> |
|||
=== Miscellaneous === |
|||
{| class="wikitable" |
|||
|- |
|||
|'''Report a complete error message when loading a script'''. This hack allows for printing a custom error message + file not found-like error message (given by '''<code>$!</code>''') + syntax error messages ('''<code>@_</code>''') |
|||
|<source lang="perl"> |
|||
do "your script.pl" |
|||
or (print "Your error message\n$!\n" and die @_); |
|||
</source> |
|||
|- |
|||
|'''uppercase / lowercase''' |
|||
|<source lang="perl"> |
|||
my $lowercase = lc "My StRiNg"; #mystring |
my $lowercase = lc "My StRiNg"; #mystring |
||
my $uppercase = uc "My StRiNg"; #MYSTRING |
my $uppercase = uc "My StRiNg"; #MYSTRING |
||
my $firstcharlowercase = lcfirst "My StRiNg"; #my StRiNg |
my $firstcharlowercase = lcfirst "My StRiNg"; #my StRiNg |
||
my $firstcharuppercase = ucfirst "My StRiNg"; #My StRiNg |
my $firstcharuppercase = ucfirst "My StRiNg"; #My StRiNg |
||
|} |
|||
</source> |
|||
=== |
=== Split a multiline variable/output === |
||
Method 1: |
|||
<source lang=perl> |
|||
<source lang="perl"> |
|||
#!/usr/bin/perl |
|||
my @ArrayList = `$CT lsvob -short`; #any command producing a multi-line output |
|||
foreach (@ArrayList) |
|||
use strict; |
|||
my @myarray; |
|||
foreach my $iter ( 1..10 ) |
|||
{ |
{ |
||
chop(); #remove the trailing newline |
|||
my $value1 = "value1_".$iter; |
|||
print "Array List - The VOB is $_.\n"; |
|||
my $value2 = "value2_".$iter; |
|||
}; |
|||
my $value3 = "value3_".$iter; |
|||
</source> |
|||
Method 2: |
|||
<source lang="perl"> |
|||
my $ScalarList = `$CT lsvob -short`; #any command producing a multi-line output |
|||
my @ArrayList2 = split/\n/,$ScalarList; #split the scalar into several lines |
|||
foreach (@ArrayList2) |
|||
print "Creating our \$hashref... "; |
|||
# Construct { key1 => value1, key2 => value2.... } creates a REFERENCE to an anonymous hash. |
|||
# Since reference are SCALAR, we assign it to a scalar variable |
|||
my $hashref = { index1 => $value1, index2 => $value2, index3 => $value3 }; |
|||
print "Done.\n", |
|||
" \$hashref: ",$hashref,"\n"; |
|||
print " content: ",$$hashref{'index1'},",",$$hashref{'index2'},",",$$hashref{'index3'},"\n"; |
|||
print "Adding \$hashref to our array... "; |
|||
push( @myarray, $hashref ); |
|||
print "Done. There are currently ", scalar(@myarray), " elements in \@myarray.\n"; |
|||
print "Accessing last element of our array..."; |
|||
print " content: @myarray[$#myarray], ${@myarray[$#myarray]}{'index1'} or better yet @myarray[$#myarray]->{'index2'}\n"; |
|||
} |
|||
print "\n\nNow we will traverse our array again...\n"; |
|||
foreach ( @myarray ) |
|||
{ |
{ |
||
print |
print "Scalar List - The VOB is $_.\n"; |
||
}; |
|||
"index1 => $$_{'index1'},", |
|||
"index2 => $$_{'index2'},", |
|||
"index3 => $$_{'index3'}\n"; |
|||
print "... or using -> operator: ", |
|||
"index1 => $_->{'index1'},", |
|||
"index2 => $_->{'index2'},", |
|||
"index3 => $_->{'index3'}\n"; |
|||
} |
|||
</source> |
</source> |
||
Line 671: | Line 711: | ||
die "http-get failed: ".$res->status_line, "\n$url\n"; |
die "http-get failed: ".$res->status_line, "\n$url\n"; |
||
} |
} |
||
my $te = HTML::TableExtract->new ( slice_columns => 0, |
my $te = HTML::TableExtract->new ( slice_columns => 0, |
||
keep_html => 1, |
keep_html => 1, |
||
Line 695: | Line 735: | ||
</source> |
</source> |
||
=== Passing filehandle as sub parameters and return values === |
|||
== Pitfalls == |
|||
This requires the use of a reference. First as return value: |
|||
<source lang="perl"> |
|||
sub openTimeOut($) |
|||
{ |
|||
my $filename = shift; |
|||
my $timeout=15; |
|||
while( !open(LOG,$filename) ) { sleep 1; --$timeout or die "Time out trying to open file $filename"; } |
|||
return \*LOG; |
|||
} |
|||
sub printToFile($@) |
|||
{ |
|||
my $filename = shift; |
|||
my $fh = openTimeOut(">$filename"); |
|||
print $fh @_; |
|||
close($fh); |
|||
} |
|||
</source> |
|||
'''BUT BEWARE''': '''<code>openTimeOut</code>''' actually returns a reference to the '''same''' file handle in the current glob on every call! The code below illustrates this: |
|||
<source lang="perl"> |
<source lang="perl"> |
||
my ($to,$from) = @_; |
|||
# Frequent Mistakes in Perl |
|||
$fhto = openTimeOut(\*TO,">>$to"); |
|||
die "can't run this"; |
|||
$fhfrom = openTimeOut(\*FROM,"<$from"); # This returns same FILEHANDLE reference as $fhto |
|||
while (<$fhfrom>) {print $fhto $_} # Fails, because $fhto and $fhfrom now refer to the same handle, which is only open for input |
|||
close($fhfrom); |
|||
close($fhto); |
|||
</source> |
|||
The solution, pass by parameters: |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
<source lang="perl"> |
|||
#Forget to chop the trailing "\n" |
|||
sub openTimeOut2(*;$) |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
{ |
|||
my $path = qx(pwd); #NOK! trailing \n will corrupt path construction |
|||
my $fh = shift; |
|||
chop( my $path = qx(pwd) ); #OK! |
|||
my $filename = shift; |
|||
my $timeout=15; |
|||
while( !open($fh,$filename) ) { sleep 1; --$timeout or die "Time out trying to open file $filename"; } |
|||
} |
|||
sub printToFile($@) |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
{ |
|||
#Mix case in name of package |
|||
my $filename = shift; |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
openTimeOut2(\*LOG,">$filename"); |
|||
# Imagine a module file named Vobs.pm |
|||
print LOG @_; |
|||
close(LOG); |
|||
} |
|||
</source> |
|||
=== Embedding a Perl script in a W2K shell script === |
|||
use Vobs; |
|||
Notice how the first '''<code>rem</code>''' is actually a multiline assignment to perl array variable '''<code>@rem</code>''', where the value is quoted with '''<code>' '</code>'''. |
|||
use VOBs; # NOK --> Will complain about double definition (but will not flag the mix case problem) |
|||
<source lang="winbatch"> |
|||
@rem= 'PERL for Windows NT - ccperl must be in search path |
|||
@echo off |
|||
ccperl %0 %1 %2 %3 %4 %5 %6 %7 %8 %9 |
|||
goto endofperl |
|||
@rem '; |
|||
# Your Perl code comes here |
|||
#----------------------------------------------------------------------------------------------------------------------------------- |
|||
# Beware of operator precedence and strange behaviour |
|||
# End of Perl section |
|||
__END__ |
|||
:endofperl |
|||
</source> |
|||
== Pitfalls == |
|||
{| class="wikitable" |
|||
|- |
|||
|Forgetting to '''chomp the trailing "\n"''' |
|||
|<source lang="perl"> |
|||
my $path = qx(pwd); #NOK! trailing \n will corrupt path construction |
|||
chomp( my $path = qx(pwd) ); #OK! |
|||
</source> |
|||
|- |
|||
|'''Mixing case''' in name of package |
|||
|<source lang="perl"> |
|||
# Imagine a module file named Vobs.pm |
|||
use Vobs; # OK |
|||
use VOBs; # NOK --> Will complain about double definition |
|||
# (but will not flag the mix case problem) |
|||
</source> |
|||
|- |
|||
|'''Operator precedence''' and strange behaviour |
|||
|<source lang="perl"> |
|||
chomp my @emptylist = qx("dir"); #NOK ! @emptylist will be empty |
chomp my @emptylist = qx("dir"); #NOK ! @emptylist will be empty |
||
chomp ( my @list = qx("dir") ); #OK ! |
chomp ( my @list = qx("dir") ); #OK ! |
||
</source> |
</source> |
||
|- |
|||
|Forgetting to '''use ${...}''' to separate variable identifier |
|||
|<source lang="perl"> |
|||
my $variable; |
|||
print "$variable_temp\n"; # NOK! Print a variable named variable_temp |
|||
print "${variable}_temp\n"; # OK! Print a $variable, followed by "_temp" |
|||
</source> |
|||
|- |
|||
|'''STDERR redirection''' cannot be given as a command parameter to system |
|||
|<source lang="perl"> |
|||
system "echo hello world! 2>\\nul"; # OK |
|||
system qq(echo hello world! 2>\\nul); # OK |
|||
system "echo", "hello world!"," 2>\\nul"; # NOK - 2>\\nul is taken as a parameter |
|||
</source> |
|||
|- |
|||
|Forgetting '''local''' in sub-routines (see [http://perldoc.perl.org/perlsub.html]). In particular pay attention that '''<code>$_</code>''' is assigned e.g. in while loops |
|||
|<source lang="perl"> |
|||
sub localized |
|||
{ |
|||
local @ARGV = ("/etc/motd"); # OK |
|||
local $/ = undef; # OK |
|||
local $_ = <>; # OK |
|||
@Fields = split /^\s*=+\s*$/; |
|||
} |
|||
</source> |
|||
|} |
|||
== CPAN - Perl Packages == |
== CPAN - Perl Packages == |
||
Line 728: | Line 857: | ||
$ perl -MCPAN -e shell # --> yes auto config |
$ perl -MCPAN -e shell # --> yes auto config |
||
</source> |
</source> |
||
To adapt config related to proxy: |
To adapt config related to proxy: |
||
<source lang=perl> |
<source lang=perl> |
||
Line 734: | Line 863: | ||
cpan> o conf commit |
cpan> o conf commit |
||
</source> |
</source> |
||
To install a Perl package (e.g. here package ''Getopt::Long''): |
To install a Perl package (e.g. here package ''Getopt::Long''): |
||
<source lang=perl> |
<source lang=perl> |
Revision as of 13:02, 28 April 2010
Reference
- Perldoc on local computer
% perldoc -q duplicate
"How can I remove duplicate elements from a list or array?"
% perldoc -f split
split /PATTERN/,EXPR,LIMIT
split /PATTERN/,EXPR
split /PATTERN/
...
- Links
- FAQ
- The FAQ is the primary source of answers to questions like How can I do....
- Manpages - List of highly recommended perldoc manpages (from the FAQ).
Basics perldata, perlvar, perlsyn, perlop, perlsub Execution perlrun, perldebug Functions perlfunc Objects perlref, perlmod, perlobj, perltie Data Structures perlref, perllol, perldsc Modules perlmod, perlmodlib, perlsub Regexes perlre, perlfunc, perlop, perllocale Moving to perl5 perltrap, perl Linking w/C perlxstut, perlxs, perlcall, perlguts, perlembed Various http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz (not a man-page but still useful, a collection of various essays on Perl techniques)
- Command-Line - Useful command-line options
- -e expression
- specifies Perl expressions.
- -p
- loops over and prints input.
- -n
- loops over and does not print input.
- -l
- strips newlines on input, and adds them on output. Use this option by default, unless the newlines need special handling, or for efficiency reasons.
Quick Introduction
Program Structure
Example of a simple Hello World program:
#!/usr/bin/perl
use strict; # Stops at compile time on potential problems - highly recommended for simplified debugging
use warnings; # Warnings - highly recommended for simplified debugging
print "Hello, World!\n";
exit 0;
Data Types
$ | for scalar values (number, string or reference) |
@ | for arrays |
% | for hashes (associative arrays) |
& | for subroutines (aka functions, procedures, methods) |
* | for all types of that symbol name. In version 4 you used them like pointers, but in modern perls you can just use references. |
<> | are used for inputting a record from a filehandle. |
\ | takes a reference to something. |
Note that the last 2 are not really type specifiers.
Arrays
Some examples:
my @array1 = ("titi","tutu"); # (...) is an array constructor
my @array2 = ("tata","toto");
push(@array1,"tete"); # Append an element to an array
push(@array1,@array2); # Append another array to an array
Arrays can be easily constructed through autovivification. Below we create a hash of arrays
my %Projects; # Projects is a hash, but we say nothing on the types of its elements...
foreach my $VOBName (keys %VOBs)
{
my $ProjectName = $VOBs{$VOBName}{'ProjectName'};
push(@{$Projects{$ProjectName}}, $VOBName); # <-- we dereference value returned by $Projects{$ProjectName} as
} # an array, hence creating automatically an array if undef
Below are some differences in how @ is handled in SCALAR or LIST context:
# RESULT CONTEXT EXPLANATION
my @a = ("titi","tutu");
my $varnoquote=@a; print "$varnoquote\n"; # "2" (SCALAR - @a is evaluated in scalar context)
my $varquote="@a"; print "$varquote\n"; # "titi tutu" (EXPAND - @a is quote-expanded, each item being separated by space)
print @a; print"\n"; # "tititutu" (LIST - $, is empty)
print(@a); print"\n"; # "tititutu" (LIST - $, is empty)
printf @a; print"\n"; # "titi" (LIST - 1st element in list is interpreted as the format string)
printf(@a); print"\n"; # "titi" (LIST - 1st element in list is interpreted as the format string)
print @a,"\n"; # "tititutu" (LIST - $, is empty)
printf "%s\n",@a; # "titi" (LIST - only 1st element is read)
Set variable $,
to modify the list separator used when printing arrays
my @a = ("titi","tutu");
$,="\n";
print @a;
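Since $, is a global variable, changing it affects every later print. The following is a minimal sketch of two ways to limit that effect, assuming the same @a as above: join builds the separated string explicitly, and local restores the previous value of $, at the end of the enclosing block.

```perl
use strict;
use warnings;

my @a = ("titi", "tutu");

# join builds the separated string explicitly and leaves $, untouched
my $joined = join(",", @a);
print "$joined\n";          # titi,tutu

{
    # local restores the previous value of $, when this block ends
    local $, = "-";
    print @a;               # titi-tutu
    print "\n";
}

print @a, "\n";             # tititutu ($, is undefined again here)
```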
Hashes
Some examples of hashes:
my %cities = ( #(...) is a hash constructor
"US" => "Washington",
"GB" => "London"
);
print $cities{"US"},"\n";
my %hashofhash = ( #This is actually a hash of references to hash
"address" => {name => "US", city => "Washington" },
"identity" => {firstname => "smith", lastname => "Smith" } );
print $hashofhash{"address"}{"name"},"\n";
print $hashofhash{"address"}->{"name"},"\n";
Note that in LIST context, a hash is transformed into an array containing both the keys and values in the hash!
my %myhash = ( key1 => "value1", key2 => "value2" );
my @myarray= ( "element1", "element2" );
push (@myarray, %myhash);
$, = ",";
print @myarray; # outputs e.g. "element1,element2,key2,value2,key1,value1" (the order of the hash pairs is not guaranteed)
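When the order of the flattened pairs matters, one option is to sort the keys first. This is only a sketch; the variable @pairs is introduced here for illustration.

```perl
use strict;
use warnings;

my %myhash = ( key1 => "value1", key2 => "value2" );

# Flattening a hash yields key/value pairs in an unspecified order,
# so build the list from the sorted keys when a stable order is needed.
my @pairs = map { ($_ => $myhash{$_}) } sort keys %myhash;

print join(",", @pairs), "\n";    # key1,value1,key2,value2
```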
References
my $VOBAttrRef = $VOBs{'AdminMask'}; # This returns a reference to a hash
my %VOBAttr = %$VOBAttrRef; # This dereferences the reference above and returns a hash
print $VOBAttr{'ProjectName'},"\n"; # We can use our new hash variable
print $$VOBAttrRef{'ProjectName'},"\n"; # ... or we can dereference our reference variable using the $$ construct
print $VOBAttrRef->{'ProjectName'},"\n"; # ... but -> can also be used to dereference
print $VOBs{'AdminMask'}->{'ProjectName'},"\n"; # We can also skip the reference variable altogether
print $VOBs{'AdminMask'}{'ProjectName'},"\n"; # ... This notation is also available as a shortcut, -> can be omitted
Passing reference to sub-routines:
$tab{'somekey'} = '...';
process(\%tab); # \%tab is a reference to the hash %tab
sub process # no () prototype, since the sub takes one argument
{
my $tab = $_[0];
$tab->{'somekey'} = '...';
}
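A self-contained sketch of the same idea, runnable under strict. The helper name add_status and the key status are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: adds a key to the hash it receives by reference
sub add_status {
    my ($tab_ref) = @_;
    $tab_ref->{status} = "processed";    # modifies the caller's hash
    return scalar keys %$tab_ref;        # number of keys after the update
}

my %tab = ( somekey => "somevalue" );
my $count = add_status(\%tab);           # \%tab is a reference to the hash

print "$count keys, status=$tab{status}\n";    # 2 keys, status=processed
```

Because only the reference is copied into the sub, the change is visible in the caller's %tab.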
Using Anonymous Hash References:
#!/usr/bin/perl
use strict;
my @myarray;
foreach my $iter ( 1..10 )
{
my $value1 = "value1_".$iter;
my $value2 = "value2_".$iter;
print "Creating our \$hashref... ";
my $hashref = { index1 => $value1, index2 => $value2 }; # { key1 => value1, ... } creates a REFERENCE to an anonymous hash.
# Since reference are SCALAR, we assign it to a scalar variable
print "Done.\n",
" \$hashref: ",$hashref,"\n";
print " content: ",$$hashref{'index1'},",",$$hashref{'index2'},"\n";
print "Adding \$hashref to our array... ";
push( @myarray, $hashref );
print "Done. There are currently ", scalar(@myarray), " elements in \@myarray.\n";
print "Accessing last element of our array...";
print " content: $myarray[$#myarray], ${$myarray[$#myarray]}{'index1'} or better yet $myarray[$#myarray]->{'index2'}\n";
}
print "\n\nNow we will traverse our array again...\n";
foreach ( @myarray )
{
print "$_ containing ",
"index1 => $$_{'index1'},",
"index2 => $$_{'index2'}\n";
print "... or using -> operator: ",
"index1 => $_->{'index1'},",
"index2 => $_->{'index2'}\n";
}
String
# Concat 2 strings
$stringC = $stringA . ucfirst($stringB);
$stringC = "$stringA$stringB";
$stringC = join('', ($stringA, ucfirst($stringB)));
If / For / While ...
TBC
Operators
Quote and quote-like operators
See perldoc for detailed information.
Customary | Generic | Meaning | Interpolates |
---|---|---|---|
'' | q{} | Literal | no |
"" | qq{} | Literal | yes |
`` | qx{} | Command | yes(*) |
 | qw{} | Word list | no |
// | m{} | Pattern match | yes(*) |
 | qr{} | Pattern | yes(*) |
 | s{}{} | Substitution | yes(*) |
 | tr{}{} | Transliteration | no (but see below) |
 | <<EOF | here-doc | yes(*) |
- (*) unless the delimiter is ''.
Interpolates means that variables like $VAR
are expanded, and that escaped sequence like \n
are processed.
Also other delimiters can be used. For instance:
#Use any brackets
print q{Hello World};
print q(Hello World);
print q[Hello World];
print q<Hello World>;
#Brackets delimiters nest correctly, like
print q{Hello {my} World}; # Equivalent to 'Hello {my} World'
#We can use any non-whitespace character
print q!Hello World!;
print q|Hello World|;
print q#Hello World#;
Beware of some caveats:
$s = q{ if($a eq "}") ... }; # WRONG - } inside "}" is not nested, so quoting will stop there
$s = q #Hello World# # WRONG - Because of the whitespace, #Hello World# is taken as a comment
Regular expressions
Use /regex/
or m!regex!
(where !
can be any quoting character).
Use =~
to match a given variable, otherwise $_
is used. Use !~
to reverse the sense of the match.
Finding matches
In SCALAR context, /regex/
returns true/false if matching is found
$myvar =~ /World/ #scalar context, returns true if $myvar contains World
/World/ #scalar context, same as above except that now it is $_ that is matched
"Hello World" =~ /World/ #scalar context, same as above, to show that left member doesn't need to be an L-Value
Extracting matches
The grouping metacharacters ()
also allow the extraction of the parts of a string that matched. For each grouping, the part that matched inside goes into the special variables $1
, $2
... They can be used just as ordinary variables:
# extract hours, minutes, seconds
$time =~ /(\d\d):(\d\d):(\d\d)/; # match hh:mm:ss format
$hours = $1;
$minutes = $2;
$seconds = $3;
In LIST context, /regex/
with groupings will return the list of matched values ($1,$2,...) . So we could rewrite the above as:
($hours, $minutes, $second) = ($time =~ /(\d\d):(\d\d):(\d\d)/);
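In list context a failed match returns an empty list, so the captures are best used only after checking that the match succeeded. A minimal sketch, assuming a sample value for $time:

```perl
use strict;
use warnings;

my $time = "12:34:56";    # sample input, assumed for this sketch

# List context: returns ($1, $2, $3) on success, an empty list on failure
my @parts = $time =~ /(\d\d):(\d\d):(\d\d)/;

if (@parts) {
    my ($hours, $minutes, $seconds) = @parts;
    print "h=$hours m=$minutes s=$seconds\n";    # h=12 m=34 s=56
}
else {
    print "no hh:mm:ss time found in '$time'\n";
}
```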
If the groupings in a regex are nested, $1
gets the group with the leftmost opening parenthesis, $2
the next opening parenthesis... For example, here is a complex regex and the matching variables indicated below it:
/(ab(cd|ef)((gi)|j))/;
 1  2      34
Using back-references
Associated with the matching variables $1
, $2
... are the backreferences \1
, \2
... Backreferences are matching variables that can be used inside a regex:
/(\w\w\w)\s\1/; # find sequences like 'the the' in string
Note that $1
, $2
.... should only be used outside of a regex, and \1
, \2
... only inside a regex.
Search & Replace
Use s/regex/replacement/modifiers
. Use =~
to match a given variable, otherwise $_
is used.
In SCALAR context, s///
returns the number of matches, or false if no match.
$x = "Time to feed the cat!";
$x =~ s/cat/hacker/; # $x contains "Time to feed the hacker!"
Note that the matching variables $1, $2 ... can be used in the replacement string.
Some modifiers:
- g - finds all matches.
- e - wraps an eval{...} around the replacement string, and the evaluated result is substituted for the matched substring. Example:
# reverse all the words in a string
$x = "the cat in the hat";
$x =~ s/(\w+)/reverse $1/ge; # $x contains "eht tac ni eht tah"
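The e modifier is also handy for reformatting a captured value with a function call. The sketch below zero-pads a leading number with sprintf; the sample string is an assumption made for the illustration.

```perl
use strict;
use warnings;

my $name = "7_report";    # sample input, assumed for this sketch

# Copy $name into $padded, then substitute on the copy;
# with /e the replacement is evaluated as Perl code.
(my $padded = $name) =~ s/(\d+)_/sprintf("%02d_", $1)/e;

print "$padded\n";        # 07_report
```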
The split operator
split /regex/, string
splits string into a list of substrings and returns that list. The regex determines the character sequence that string is split with respect to. For example, to extract a comma-delimited list of numbers, use
$x = "1.618,2.718, 3.142";
@const = split /,\s*/, $x; # $const[0] = '1.618', $const[1] = '2.718', $const[2] = '3.142'
If the empty regex //
is used, the string is split into individual characters. If the regex has groupings, then the list produced contains the matched substrings from the groupings as well:
$x = "/usr/bin";
@parts = split m!(/)!, $x; # $parts[0] = '' Since the first character of $x matched the regex, an initial element was prepended.
# $parts[1] = '/' The delimiter is also in the list because of the grouping (/)
# $parts[2] = 'usr'
# $parts[3] = '/' Yet a delimiter because of the grouping
# $parts[4] = 'bin'
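split also accepts a third argument, LIMIT, which caps the number of fields returned. A short sketch, assuming a key=value line whose value may itself contain the delimiter:

```perl
use strict;
use warnings;

my $line = "name=a=b";    # sample input, assumed for this sketch

# LIMIT of 2: split at the first "=" only, the rest stays in the last field
my ($key, $value) = split /=/, $line, 2;

print "key=$key value=$value\n";    # key=name value=a=b
```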
Lookahead / Lookbehind
The lookahead and lookbehind assertions are generalizations of the anchor concept. Lookahead and lookbehind are zero-width assertions that let us specify which characters we want to test for. The lookahead assertion is denoted by (?=regexp)
and the lookbehind assertion is denoted by (?<=fixed-regexp)
. Some examples are
$x = "I catch the housecat 'Tom-cat' with catnip";
$x =~ /cat(?=\s+)/; # matches 'cat' in 'housecat'
@catwords = ($x =~ /(?<=\s)cat\w+/g); # matches, $catwords[0] = 'catch' $catwords[1] = 'catnip'
$x =~ /\bcat\b/; # matches 'cat' in 'Tom-cat'
$x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in middle of $x
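Because these assertions are zero-width, they can also mark a position where text is inserted without consuming any character. The sketch below, an illustration rather than part of the original notes, adds thousands separators to a sample number.

```perl
use strict;
use warnings;

my $number = "1234567";    # sample input, assumed for this sketch

# Insert a comma at every position that has a digit before it (lookbehind)
# and a multiple of three digits up to the end of the string (lookahead).
(my $grouped = $number) =~ s/(?<=\d)(?=(?:\d{3})+\z)/,/g;

print "$grouped\n";        # 1,234,567
```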
Grep / Map
Use grep
on a list to return the elements of that list for which the expression is true. For instance
@foo = grep(!/^#/, @bar); # Only returns line that are not comments
my @array = ("el1","gel2","el3","gel1","gel2");
my @array2 = grep {s/(.*el)/reverse $1/e} @array; # grep may also modify the elements in the returned list
Use map
on a list to apply a given expression on all elements in the list.
@chars = map(chr, @nums); # Returns the list of characters corresponding to the list of numbers
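grep and map are often chained: grep selects the elements and map transforms them. A minimal sketch using block syntax, with sample data assumed for the illustration:

```perl
use strict;
use warnings;

my @lines = ("# comment", "alpha", "beta", "# another");

# grep keeps the non-comment lines, map upper-cases each of them
my @upper = map { uc($_) } grep { !/^#/ } @lines;

print join(" ", @upper), "\n";    # ALPHA BETA
```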
File and I/O
chdir (function)
Change the current working directory
-X (function)
Various tests on files, directories... pretty much like in Bash scripts.
getcwd (module Cwd)
Get the current working directory
abs_path
Transform a relative path into absolute path
Read something from standard input
$line = <STDIN>;
$line = readline(*STDIN); # same thing
chomp($line = <STDIN>); # remove trailing newline (chop would remove the last character, whatever it is)
Read one character from STDIN
print "Press RETURN...";
$key = getc();
System calls
system "echo hello world!";
system qq(echo hello world!);
system $MYCMD, qw(param1), 'the name is'.getname($index);
Discard STDERR on Windows / Linux. Note that on Windows we use \nul because each folder has a nul device and we want to reduce the number of handles used.
my $STDERRNULL = $^O eq 'MSWin32' ? "2>\\nul" : "2>/dev/null"; # pick the null device of the current OS
my @CTResults = qx(ls somedirectory $STDERRNULL);
Capture STDOUT
my @output = `ls`;
my @output = qx(ls); # same thing
system("ls >output.txt");
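Capture command exit status
my $wait_status = system("del file.txt"); # system returns the wait status, not the exit code
my $exit_status = $wait_status >> 8; # the exit code is in the high byte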
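The value returned by system (also available in $?) packs several pieces of information. The sketch below decodes it. To stay portable it runs a child Perl through $^X instead of an OS-specific command, and that choice is only an assumption made for the example:

```perl
use strict;
use warnings;

# Run a child Perl that exits with code 3; the list form bypasses the shell.
my $wait_status = system($^X, '-e', 'exit 3');
my $exit_code   = $wait_status >> 8;     # high byte: exit code of the command
my $signal      = $wait_status & 127;    # low 7 bits: signal that killed it, if any

if ($wait_status == -1) {
    print "failed to execute: $!\n";
} elsif ($signal) {
    print "child died with signal $signal\n";
} else {
    print "child exited with code $exit_code\n";    # child exited with code 3
}
```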
Temporarily disable STDERR and restore it afterwards
open(SAVE_STDERR, '>&STDERR');
close(STDERR) unless $ENV{CLEARCASE_TRACE_TRIGGERS};
$exe = qx(file $ptmp) =~ /executable|bourne|commands text|\bscript/i;
open(STDERR, '>&SAVE_STDERR');
close(SAVE_STDERR);
Sub-routines
Declaration and definition syntax:
sub NAME[(PROTO)] [: ATTRS]; # A "forward" declaration
sub NAME[(PROTO)] [: ATTRS] BLOCK # A declaration and definition
$subref = sub (PROTO) : ATTRS BLOCK; # An anonymous sub-routine, called with &$subref
Importing a sub-routine:
use MODULE qw(NAME1 NAME2 NAME3);
Calling a sub-routine:
NAME(LIST); # & is optional with parentheses.
NAME LIST; # Parentheses optional if predeclared/imported.
&NAME(LIST); # Circumvent prototypes.
&NAME; # Makes current @_ visible to called subroutine.
Examples:
sub mySub1
{
    my ($param1, $param2) = @_; # copy all parameters at once
    return $param1.$param2;
}
sub mySub2
{
    my $param1 = shift; # shift works on @_ by default
    my $param2 = shift;
    return $param1.$param2;
}
Using default value for sub-routine parameters:
sub myfunc
{
my($suffix) = @_ ? "@_" : "defaultvalue";
}
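When there are several parameters, defaults can also be applied one by one. The following sketch uses a hypothetical greet sub and relies on the defined-or operator, which requires Perl 5.10 or later:

```perl
use strict;
use warnings;

sub greet {
    my ($name, $greeting) = @_;
    $greeting //= "Hello";                   # default only when undefined (Perl 5.10+)
    $name = "World" unless defined $name;    # equivalent form that works on older Perls
    return "$greeting, $name!";
}

print greet(), "\n";                     # Hello, World!
print greet("Perl"), "\n";               # Hello, Perl!
print greet("Perl", "Goodbye"), "\n";    # Goodbye, Perl!
```

Unlike a test with ||, the defined-or form keeps legitimate false values such as 0 or the empty string.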
Functions
See [1] for a detailed list of Perl functions.
Chop / Chomp
chop removes the last character of a string, whatever that character is. It also works on lists.
chop( my $userinput=<STDIN> ); #Chop the trailing "\n" in user input
chop( my @list=qx(ls) ); #Chop the trailing "\n" in the command output
chomp removes the trailing record separator (typically \n) of a string, if present. It also works on lists.
chomp( my $userinput=<STDIN> ); #Chomp the trailing "\n" in user input IF PRESENT
chomp( my @list=qx(ls) ); #Chomp the trailing "\n" in the command output IF PRESENT
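The difference only shows when the input has no trailing newline. A minimal sketch with two made-up input strings:

```perl
use strict;
use warnings;

my @results;
for my $input ("line\n", "line") {
    my $chopped = $input;
    chop $chopped;     # always removes the last character
    my $chomped = $input;
    chomp $chomped;    # removes the trailing newline only if there is one
    push @results, "$chopped|$chomped";
}
print "$_\n" for @results;
# line|line
# lin|line
```

For this reason chomp is usually the safer choice for cleaning up input.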
-X
The -X file test operators can be used for various tests on files and directories, similar to the test command in Bash:
print "The file exists\n" if -e "../somefile";
print "The directory exists\n" if -d "../some/directory";
Use _ to save a system call, like in:
stat($filename);
print "Readable\n" if -r _;
print "Writable\n" if -w _;
print "Executable\n" if -x _;
print "Text\n" if -T _;
print "Binary\n" if -B _;
Since 5.9.1, operators can be stacked:
print "writable and executable\n" if -f -w -x $file; # same as -x $file && -w _ && -f _
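The sketch below puts these tests together on a temporary file so it can be run anywhere. The use of File::Temp (a core module) and the property names are choices made for this example only:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file that is deleted when the program exits.
my ($fh, $filename) = tempfile(UNLINK => 1);
print $fh "some text\n";
close $fh;

my @props;
if (-e $filename) {                       # one stat call on the file ...
    push @props, "exists";
    push @props, "file"     if -f _;      # ... whose result is reused through _
    push @props, "readable" if -r _;
    push @props, "nonempty" if -s _;
    push @props, "dir"      if -d _;
}
print "@props\n";    # exists file readable nonempty
```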
Modules
See Core Modules for a detailed list of Perl modules. Here is a list of frequently used ones:
(Cwd) getcwd / abs_path
The function getcwd returns the current working directory. abs_path transforms a given relative path into its equivalent canonical absolute form.
use Cwd qw(getcwd abs_path);
my $dir = getcwd();
my $abs_path = abs_path($file);
File::Find
File::Find provides functions similar to the Unix find command for searching through directory trees doing work on each file.
use File::Find;
find(\&wanted, @directories_to_search); #depth-first search - preorder traversal - no options
sub wanted { ... }
use File::Find;
find({ wanted => \&process, follow => 1 }, '.'); #With options
sub process { ... }
use File::Find;
finddepth(\&wanted, @directories_to_search); #depth-first search - post-order traversal - no options
sub wanted { ... }
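As a concrete instance of the wanted callback, here is a self-contained sketch that collects all .txt files below a directory. The directory layout is created on the fly with File::Temp, and the file names are hypothetical:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small scratch tree: a.txt, b.log and sub/c.txt.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir failed: $!";
for my $path ("$root/a.txt", "$root/b.log", "$root/sub/c.txt") {
    open(my $fh, '>', $path) or die "Cannot create $path: $!";
    close $fh;
}

my @found;
find(sub {
    # Inside the callback, $_ is the bare file name (find has changed into
    # its directory) and $File::Find::name is the path including $root.
    push @found, $File::Find::name if -f $_ && /\.txt\z/;
}, $root);

print "$_\n" for sort @found;    # .../a.txt and .../sub/c.txt
```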
Examples
One-Liners
perl -ne 'print unless /^$/../^$/' input # print lines, unless blank
perl -ne 'print if ! /^$/../^$/' input # reduce runs of blank lines to a single blank line
perl -nle 'print $.; close ARGV if eof' input input # $. needs to be reset (by closing ARGV) between 2 input files
perl -nle 'print for m/\b(\S+)\b/g' paragraphs # print words from file paragraphs
perl -nle 'while(m/(\S+)\s+the\s+(\S+)/g){print "$1 $2"}' paragraphs # ... while loop needed when using multiple back-references
perl -lne 'print for /id is <(\d+)>/g' # match pattern and extract backreference
perl -lne 'print $2 for /id is <(\d+)> or <(\d+)>/g' # ... print 2nd matched backreference
cat oldfile | perl -pe 's/(\d+)_/sprintf("%2.2d_",$1)/e' > newfile # evaluate regex substitutions
Miscellaneous
Report a complete error message when loading a script. This hack allows for printing a custom error message + file not found-like error message (given by $!) + syntax error messages (given by $@).
|-
|'''uppercase / lowercase'''
|<source lang="perl">
my $lowercase = lc "My StRiNg"; #my string
my $uppercase = uc "My StRiNg"; #MY STRING
my $firstcharlowercase = lcfirst "My StRiNg"; #my StRiNg
my $firstcharuppercase = ucfirst "My StRiNg"; #My StRiNg
</source>
|}
=== Split a multiline variable/output ===
Method 1:
<source lang="perl">
my @ArrayList = `$CT lsvob -short`; #any command producing a multi-line output
foreach (@ArrayList)
{
    chomp(); #remove the trailing newline, if present
    print "Array List - The VOB is $_.\n";
}
</source>
Method 2:
my $ScalarList = `$CT lsvob -short`; #any command producing a multi-line output
my @ArrayList2 = split/\n/,$ScalarList; #split the scalar into several lines
foreach (@ArrayList2)
{
print "Scalar List - The VOB is $_.\n";
};
Parsing Command Line Parameters
Command line parameters are available in the array @ARGV.
print scalar @ARGV; #number of parameters
print $#ARGV; #index of the last parameter, i.e. number of parameters - 1
print "1st param: $ARGV[0]"; #positional parameters
print "2nd param: $ARGV[1]";
print "Executable name: $0"; #Name of current executable
usage() unless defined($ARGV[0]); # defined($ARGV[0]) is true if there is a parameter
Simple version
#!/usr/bin/perl
use strict;
use warnings;
my $verbose=0;
my $projectdir;
# Parse command options (-option).
while ($#ARGV>=0 && $ARGV[0] =~ m/^\-/ ) {
$verbose=1 if $ARGV[0] =~ m/^\-v/i;
shift @ARGV;
}
# Parse mandatory parameter
usage() unless defined($ARGV[0]);
$projectdir=$ARGV[0];
# Show parsing result
print "verbose=$verbose\n";
print "projectdir=$projectdir\n";
exit 0;
sub usage {
print "Usage: $0 [options] directory\n";
print "\n";
print " Options:\n";
print " -v verbose mode\n";
exit;
}
Using Getopt::Long
use strict;
use Getopt::Long qw(:config no_ignore_case);
my $debug=0;
my $quiet=0;
my $username;
# Parse options
GetOptions ("d|debug+" => \$debug,
"q|quiet" => \$quiet,
"u|user=s" => \$username ) || usage();
# Parse remaining parameters
my $url = $ARGV[0];
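To see what the option specifications do, the sketch below parses a hard-coded argument list with GetOptionsFromArray, which is exported by Getopt::Long on request. The sample arguments and the usage text are assumptions made for this example:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

Getopt::Long::Configure("no_ignore_case");

# A hypothetical command line: -d given twice, a user name and a URL.
my @args = ('-d', '-d', '--user=alice', 'http://example.com/');

my ($debug, $quiet, $username) = (0, 0, undef);
GetOptionsFromArray(\@args,
    "d|debug+" => \$debug,       # "+" increments the counter at each occurrence
    "q|quiet"  => \$quiet,       # simple flag
    "u|user=s" => \$username,    # "=s" requires a string value
) or die "Usage: $0 [-d] [-q] [-u user] url\n";

# Recognized options have been removed; only the plain parameters remain.
my $url = shift @args;
print "debug=$debug quiet=$quiet user=$username url=$url\n";
# debug=2 quiet=0 user=alice url=http://example.com/
```

GetOptions behaves the same way but works directly on @ARGV.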
Internet
#!/usr/bin/perl
use strict;
use LWP::UserAgent;
use LWP::Debug;
use HTTP::Cookies;
use HTML::TableExtract;
my $debug = 0; # Set to 1 for debug information
my $proxy = 0;
my $username;
my $password;
my $url = $ARGV[0];
LWP::Debug::level('+') if $debug;
my $ua = LWP::UserAgent->new;
# Session cookie
my $jar = HTTP::Cookies->new ();
$ua->cookie_jar($jar);
# Enable proxy...
$ua->env_proxy if $proxy;
# Fetch the articles + url
my %articles = get_articles ( $ua, $url );
# Get starting URL....
my $res = $ua->get($url);
unless ($res->is_success) {
    die "Die: " . $res->status_line, "\n";
}
print $res->content if $debug;
# Post the login form (reuse $res instead of declaring it a second time)
$res = $ua->post( $url,
    [
        'j_username' => $username,
        'j_password' => $password,
        'Submit' => 'Entrer'
    ]
);
unless ($res->is_success) {
    die "Die: ".$res->status_line, "\n";
}
exit 0; # moved to the end: the code after it was unreachable
sub get_articles {
my $ua = shift;
my $url = shift;
my $res = $ua->get ($url);
unless ($res->is_success) {
die "http-get failed: ".$res->status_line, "\n$url\n";
}
my $te = HTML::TableExtract->new ( slice_columns => 0,
keep_html => 1,
keep_headers => 1,
subtables => 1,
headers => [qw(Matter)] );
$te->parse($res->content);
open(my $fh, ">mpe.html") || die "Cannot create file: $!\n";
print $fh $res->content;
close($fh);
foreach my $ts ( $te->tables ) {
# print "Table (", join(',', $ts->coords), "):\n";
foreach my $row ( $ts->rows ) {
print "Row: " . join (';', @$row ). "\n";
next unless $row->[0] =~ m/\/content\/(.*)\/fulltext/;
print $1."\n";
}
}
return 0;
}
Passing filehandle as sub parameters and return values
This requires the use of a reference. First as return value:
sub openTimeOut($)
{
my $filename = shift;
my $timeout=15;
while( !open(LOG,$filename) ) { sleep 1; --$timeout or die "Time out trying to open file $filename"; }
return \*LOG;
}
sub printToFile($@)
{
my $filename = shift;
my $fh = openTimeOut(">$filename");
print $fh @_;
close($fh);
}
BUT BEWARE: openTimeOut always returns a reference to the same file handle, the glob LOG of the current package! The code below illustrates this:
my ($to,$from) = @_;
$fhto = openTimeOut(">>$to");
$fhfrom = openTimeOut("<$from"); # This returns the same FILEHANDLE reference as $fhto
while (<$fhfrom>) {print $fhto $_} # Fails, because now $fhto == $fhfrom, which is only open for input
close($fhfrom);
close($fhto);
The solution is to pass the file handle as a parameter:
sub openTimeOut2(*;$)
{
my $fh = shift;
my $filename = shift;
my $timeout=15;
while( !open($fh,$filename) ) { sleep 1; --$timeout or die "Time out trying to open file $filename"; }
}
sub printToFile($@)
{
my $filename = shift;
openTimeOut2(\*LOG,">$filename");
print LOG @_;
close(LOG);
}
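Another way around the shared-glob problem is a lexical file handle: open stores a fresh handle in a my variable, so each call returns an independent one. The sketch below is one possible variant, not the method described above. The helper name open_or_die and the scratch file names are hypothetical, and the timeout loop is left out for brevity:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Each call creates a new lexical handle, so handles never clash.
sub open_or_die {
    my ($mode, $filename) = @_;
    open(my $fh, $mode, $filename) or die "Cannot open $filename: $!";
    return $fh;
}

my $dir  = tempdir(CLEANUP => 1);
my $from = "$dir/from.txt";
my $to   = "$dir/to.txt";

# Prepare a source file with two lines.
my $out = open_or_die('>', $from);
print $out "line 1\nline 2\n";
close $out;

# Two handles open at the same time: one for reading, one for appending.
my $fhfrom = open_or_die('<', $from);
my $fhto   = open_or_die('>>', $to);
while (my $line = <$fhfrom>) { print $fhto $line }
close $fhfrom;
close $fhto;

my $check  = open_or_die('<', $to);
my @copied = <$check>;
close $check;
print scalar(@copied), " lines copied\n";    # 2 lines copied
```

The three-argument form of open also keeps the mode separate from the file name, which avoids surprises with names starting with < or >.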
=== Embedding a perl script in a W2K shell script ===
Notice how the first rem is actually a multiline assignment to the perl array variable @rem, where the value is quoted with ' '.
@rem= 'PERL for Windows NT - ccperl must be in search path
@echo off
ccperl %0 %1 %2 %3 %4 %5 %6 %7 %8 %9
goto endofperl
@rem ';
# Your Perl code comes here
# End of Perl section
__END__
:endofperl
Pitfalls
Forgetting to chomp the trailing "\n"
my $path = qx(pwd); #NOK! trailing \n will corrupt path construction
chomp( my $path = qx(pwd) ); #OK!
Mixing case in name of package
# Imagine a module file named Vobs.pm
use Vobs; # OK
use VOBs; # NOK → Will complain about double definition
# (but will not flag the mix case problem)
Operator precedence and strange behaviour
chomp my @emptylist = qx("dir"); #NOK ! @emptylist will be empty
chomp ( my @list = qx("dir") ); #OK !
Forgetting to use ${...} to separate variable identifier
my $variable;
print "$variable_temp\n"; # NOK! Interpolates a variable named $variable_temp
print "${variable}_temp\n"; # OK! Prints $variable, followed by "_temp"
STDERR redirection cannot be given as a command parameter to system
system "echo hello world! 2>\\nul"; # OK
system qq(echo hello world! 2>\\nul); # OK
system "echo", "hello world!"," 2>\\nul"; # NOK - 2>\\nul is taken as a parameter
Forgetting local in sub-routines (see [6]). In particular, keep in mind that $_ is assigned implicitly, e.g. in while loops:
sub localized
{
local @ARGV = ("/etc/motd"); # OK
local $/ = undef; # OK
local $_ = <>; # OK
@Fields = split /^\s*=+\s*$/;
}
CPAN - Perl Packages
First time launch:
$ cpan # ... OR ...
$ perl -MCPAN -e shell # --> yes auto config
To adapt config related to proxy:
cpan> o conf init /proxy/ # (to enter an empty string, simply enter 1 space char as a value)
cpan> o conf commit
To install a Perl package (e.g. here package Getopt::Long):
$ cpan
cpan> install Getopt::Long
Editing the configuration:
cpan> o conf init # Reset the configuration
cpan> o conf http_proxy http://proxy:8080/ # Edit a given variable (e.g. here changing proxy settings)
cpan> o conf commit # To commit changes in the configuration
cpan> o # to get o options
cpan> o conf # To get o conf option
To edit CPAN database url:
cpan> o conf /urllist/
cpan> o conf init /urllist/
cpan> o conf urllist shift
cpan> o conf urllist unshift ftp://my.new.site/
cpan> o conf commit
To update CPAN itself:
cpan> install Bundle::CPAN
cpan> reload cpan