PROGRAMMING IN PERL FOR THE
WORLD WIDE WEB
Perl - What is it?
What is this thing called Perl? First of all,
Perl is not a misspelling of "pearl." It is an acronym that stands for
Practical Extraction and Report Language. The language was created by Larry
Wall in 1987. He created it when he realized that the popular scripting language
of the time was not enough and the more powerful programming language, C, was
just too much. Larry Wall developed the language to handle file information
extraction with ease. To do this, he created Perl by borrowing from various UNIX
tools and languages.
The Perl language was originally only for the
UNIX system and became a language of choice for system administrators and
programmers. This is because it was a flexible, quick-to-program language which
was freely distributed. As the Internet grew, Perl became a popular language
for creating CGI (Common Gateway Interface) scripts, since most of the early Web
servers were run on UNIX systems.
As the years have gone by, Perl's
functionality has only grown. It is now in its fifth major version with Perl 5
and has support for many different operating systems (no longer a UNIX only
language). Perl can run on Windows 95/98/NT, Macintosh, Linux, BeOS, and so many
more systems. It is still freely distributed and only getting better.
Perl is not exactly a "programming"
language; it is usually called a "scripting" language. A programming language is
compiled (like C/C++ or Java), whereas a scripting language is interpreted
(like JavaScript). Because Perl is an interpreted language and because it
is supported by so many different systems, Perl is portable. A script written on
a UNIX system should be able to run on a Windows 98 machine. There are slight
differences between Perl on different systems, but these have been kept to a
minimum in Perl 5.
Here is hello.pl again:
1. #!/usr/bin/perl -w
2. #
3. # This is our first try, the "Hello World!" test
4.
5. print "Hello World!\n";
As before, line 1 contains
the "shebang". This line is not needed, but it makes running the
script easier on UNIX and Linux systems. If you are not using either of these
systems and do not plan to ever run your script on either of them, then
you could leave the first line out. However, it is only one line and you never
know where your script may be needed or used.
The number sign (#)
is how you make a one-line comment in Perl. As you can see, the first three
lines of the hello.pl
script are comments, including the "shebang".
Line 5 has the real meat, and the only meat,
of the script. The print
function, as you have guessed, prints the string "Hello World!" to the
screen. If you have dealt with C/C++, Java, or even JavaScript, you may see some
familiar things on this line. One of these is that the "Hello World!"
string is followed by the newline escape character \n.
By using the \n
character, any output after "Hello World!" starts on the next line.
The other familiar thing is the semicolon
(;), which is found at the end of the line. A semicolon is needed at the end of
every Perl statement.
Perl, like JavaScript (or, perhaps it is
better to say that JavaScript, like Perl) has a very loose variable type
system. This is very different from languages like C/C++ or Java. These
languages require a programmer to declare each variable with a certain type,
like an integer, floating point number, character, string, and so on. Perl only
distinguishes between two basic types: scalar data and list data. Scalar data
holds only one thing. List data, on the other hand, can hold many elements.
Here we will look at scalar data and variables (as the header suggests). Scalar
data is split into two basic types: numbers and strings. Numeric data can come
in many forms. Numbers can be in integer or floating point form. Commas are not
allowed in numbers, but you can use underscores to break a number up to make it
easier to read. For exponents you use either an uppercase or lowercase 'e'
followed by a positive or negative number. You are not restricted to
just decimal numbers; you can also use octal and hexadecimal numbers. To get
Perl to recognize octal numbers you use a leading zero (0) followed by the octal
digits (0-7). Hexadecimal numbers are preceded by 0x and then followed by valid
hexadecimal digits. Here are some valid numbers:
42
52.323
0732
0xf00
3E45
.34e-34
1_200_312
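If you want to check how Perl reads these forms, a minimal sketch like the one below prints a few of them back. The array name @numbers is only an illustration and is not part of the tutorial's scripts.

```perl
use strict;
use warnings;

# Hypothetical array holding a few of the literal forms described above.
my @numbers = (42, 52.323, 0732, 0xf00, 1_200_312);

# Perl stores the value, not the notation, so each one prints in decimal:
# 42, 52.323, 474, 3840, 1200312
print "$_\n" for @numbers;
```

Notice that the octal and hexadecimal literals come back as ordinary decimal numbers, and the underscores are gone.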
String data can come in two
basic forms (there are more, but we will not get into them now). The first form
is a string within single quotes ('').
All characters bounded by single quotes are read literally, meaning that escape
sequences and other special characters are not recognized. Strings bounded by
double quotes ("")
will recognize special characters and interpolate most Perl variables (in other
words, the value of a variable is incorporated into the string). That is, you
could have this string:
print 'Hello\n'; # in Perl code foobar.pl
$ perl foobar.pl
Hello\n$
# however, if you used
print "Hello\n"; # in Perl code foobar.pl
$ perl foobar.pl
Hello
$
As you can see, in the
first example (using the single quotes), the newline character (\n)
was not recognized as anything special. In the second example, it was. If you
would like to see this for yourself, then run the hello.pl example that I showed
you on the last page. Take note as to what the
output is. After that, change the code so that it uses single quotes instead
of double quotes.
Scalar variables, which hold scalar data, are denoted by a dollar sign ($)
followed by the variable name. Here is an example of one:
$someVarName;
When naming a scalar
variable there are some things that you need to keep in mind. For one, a scalar
variable name must start with either an underscore or a letter. After the first
character, any number of digits, letters, and underscores can be used. Other
characters like #, %, ", and so on are reserved for special Perl
variables. Names are case sensitive, so the variables $tucows and $TUCOWS
are different.
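Here is a small sketch that ties the quoting rules and the naming rules together. The variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical variables, chosen only for this illustration.
my $name = "World";
my $NAME = "Perl";             # a different variable: names are case sensitive

my $double = "Hello $name!";   # double quotes interpolate $name
my $single = 'Hello $name!';   # single quotes keep the text literally

print "$double\n";   # Hello World!
print "$single\n";   # Hello $name!
print "$NAME\n";     # Perl
```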
Now that we have taken a look at scalar data and how to name a scalar variable,
let's move on to see what operations we can do.
Let us now get into what Perl can do with its
scalar data. First,
here is information on the numeric aspect of scalars.
Arithmetic Operators:
As you would guess, you can do all the basic arithmetic operations. Here they
are in this table:
Arithmetic Operators
Operator | Description
+ | Addition
- | Subtraction
* | Multiplication
/ | Division
% | Modular Division (remainder)
** | Exponentiation
The above operators are pretty straightforward. The only ones that you may not be
too familiar with would be the last two. The % operator
(modular division) will return the remainder of a division. For example,
10 % 7 would return 3, which is the remainder of 10 / 7.
The last one in the list is the exponent operator (sorry, it does not work like
the ++ and -- operators in this or other languages). This returns the first
number raised to the power of the second number. As an example,
10 ** 2 would return 100.
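A quick sketch of the two less familiar operators, plus ordinary division for comparison. The variable names are only placeholders.

```perl
use strict;
use warnings;

my $remainder = 10 % 7;    # 3, what is left over after 10 / 7
my $power     = 10 ** 2;   # 100, ten raised to the second power
my $quotient  = 10 / 4;    # 2.5, division keeps the fractional part

print "$remainder $power $quotient\n";   # 3 100 2.5
```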
Assignment Operators:
Perl also has those handy-dandy shortcut assignment operators that you may have
become accustomed to in other languages. Here is a list of them:
Assignment Operators
Operator | Example and Non-shorthand Equivalent
+= | $x += 10; # $x = $x + 10;
-= | $x -= 4; # $x = $x - 4;
*= | $x *= 8; # $x = $x * 8;
/= | $x /= 3; # $x = $x / 3;
%= | $x %= 7; # $x = $x % 7;
**= | $x **= 2; # $x = $x ** 2;
++ | $x++; # or ++$x; $x = $x + 1;
-- | $x--; # or --$x; $x = $x - 1;
Increment/Decrement Operators:
The increment (++) and decrement (--)
operators are shortcut operators that do as their names suggest. The ++
will add one to the scalar that it is attached to and --
will subtract one from the scalar. Both of these operators can be placed either
before the variable or after. The placement of the operators does affect the way
that they work. For example, placing the increment operator before the variable
name will force the increment to happen first. As expected, placing the operator
after the variable will cause the increment to occur after the variable's value
has been used. Here is an example of this:
$foo = 0;
print ++$foo; #prints 1 (increments first)
$bar = 0;
print $bar++; #prints 0 (increments after)
Numeric Rounding:
One thing that we have not covered so far with numeric operations is how to
change the precision of a number. The usual way to do this is by using the
printf and sprintf
functions. These are actually string formatting functions, but Perl can convert
between string and number type scalar data.
If you are familiar with C, then you will recognize these. They work the same as
their C equivalents. These functions take a string, and wherever a variable
needs formatting you use special notation to indicate what the function is to do.
After the string, you then pass the variable(s) that the function needs
to format. Here are some examples:
$foobar = 23.63542;
printf("%.2f", $foobar);
#prints 23.64
printf("One number: %.0f. Another: %d", $foobar, $foobar);
#prints "One number: 24. Another: 23"
$foo = sprintf("%d", $foobar);
#returns 23
These two functions are a
lot alike. In fact, printf is equivalent to calling print on the result of
sprintf. The difference between these two functions is that printf outputs the
formatted string to the output device (like the screen),
whereas sprintf
returns the string so that a scalar variable can catch it.
Above, you are given two basic examples of how to do some string formatting with
these functions. You have %f
(floating point number) and %d
(decimal integer). When using the %f
option, you place a decimal point followed by a number to indicate the
number of decimal places you want the number carried out to. This will round
the number. As you can see, to round the number 23.63542
to the second decimal place (so that it equals 23.64), you would use %.2f.
If you wanted to round to the nearest integer, then you could use %.0f.
Using %d,
on the other hand, will take the integer part of a number; it does not round.
You can format more than one number at a time with the printf and sprintf
functions. You just place all the format notations within the string where
you want them. After the string, you then list the variables, separated by
commas, in the order they appear in the string. Here is an example:
$ALPHA = 1.23456789;
$alpha = 9.87654321;
$foobar = 1.92837465;
printf("%d %.0f %.1f", $ALPHA, $alpha, $foobar);
#prints: "1 10 1.9"
For those of you who are
familiar with C and with printf,
you should know that print is more efficient than printf
when writing general information to the output device. You should only use printf
when it is needed and use print the rest of the time.
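To see rounding and truncation side by side, here is a minimal sketch using sprintf so the results can be stored. The variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $value = 23.63542;

my $rounded   = sprintf("%.2f", $value);   # "23.64", rounded to two places
my $nearest   = sprintf("%.0f", $value);   # "24", rounded to nearest integer
my $truncated = sprintf("%d",   $value);   # "23", fractional part dropped

print "$rounded $nearest $truncated\n";    # 23.64 24 23
```

Because sprintf hands the string back, you can keep the rounded value around and print it later with plain print.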
Perl: Scalar Operations - Strings
Now is the time
to look at some of the operators that work on strings.
Concatenation and Repetition:
Perl handles string concatenation differently than other languages, like
C/C++ or Java. Instead of using the ordinary plus operator (+),
you use the dot operator (.).
Here is an example:
someString = "Hello" + " World!"; //concatenation in Java
$someString = "Hello" . " World!"; #concatenation in Perl
Perl also has a built-in
ability to repeat a string a specified number of times. To do this, you use the x
operator. This is how it is used in its generic form:
"string" x number
How this works is that "string"
will be repeated number times. Here are some examples:
"ALPHA" x 3; # "ALPHAALPHAALPHA"
'foobar' x 4; # 'foobarfoobarfoobarfoobar'
123 x 5; # "123123123123123"
The third example is a bit different than what I told you above. Instead of
finding a string you see the number 123.
What Perl will do in this case is convert the number 123
to a string and then repeat it.
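Both operators can be tried together in a short sketch. The variable names below are just for illustration.

```perl
use strict;
use warnings;

my $greeting = "Hello" . " " . "World!";   # dot joins strings
my $rule     = "-" x 12;                   # x repeats the string on its left
my $digits   = 123 x 2;                    # the number is converted to "123" first

print "$greeting\n";   # Hello World!
print "$rule\n";       # ------------
print "$digits\n";     # 123123
```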
String Quotes:
So far you know that you can use either single quotes ('')
or double quotes ("")
to enclose a string. Also, you know that if you use single quotes, then
special characters will not be recognized. This feature is good if you need to
use a lot of double quotation marks in your string. The only problem here is,
what if you wanted to use both single and double quotes in your string? You
could just escape the characters, but isn't there a better way? Believe it or
not, the answer is yes. Remember when I wrote that there were actually more than
two ways to define a string?
Perl allows you to use the q
operator to define a character to be the string's boundaries. That is, you do
not have to use single quotation marks if you don't want to; you could use
the pound sign (#)
instead. Here is how the q
operator works:
q#This is my new string#;
q/Here is a different one/;
As you can see, all you
have to do is put the character that you want to use as your string's boundary
right after the q that starts the string. To end the string, use the same
character again. Using a single q
makes the string behave as if it were defined by single quotes. However, if you
use two qs (qq),
then it acts as if it were defined by double quotes. Here is a string
defined by the double q
method:
qq/Here is a string/;
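The following sketch shows why this is handy: the text contains both kinds of quotation marks, yet nothing needs escaping. The variable names and the sample sentence are invented for the example.

```perl
use strict;
use warnings;

my $name = "Perl";

# q behaves like single quotes: $name stays as literal text.
my $literal = q/He said "don't" to $name/;

# qq behaves like double quotes: $name is interpolated.
my $interpolated = qq/He said "don't" to $name/;

print "$literal\n";        # He said "don't" to $name
print "$interpolated\n";   # He said "don't" to Perl
```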
Now that we have seen some
of the basic number and string operations, let's look at the subject of
conversions.
Scalar conversions are a topic that is
not all that trivial. There will always be times that you calculate a number,
but then need it in string form. Or worse yet, you get a number in string form
but need it converted to numeric form so that you can manipulate it. With Perl,
we are fortunate enough to have a language that will do most of the
string-to-number and number-to-string conversions for us. You have seen this
already when we covered the x
operator for string repetition. The example that illustrated this was
123 x 5;, which would produce the string
"123123123123123".
Perl will convert between strings and numbers on the fly. You do not have to
worry about adding a string to a number, you just do it. Here is an example:
5 + "23"; #equals 28
Perl will also try to
convert strings into numbers when it would seem that it should not. Here is an
example of what is meant by this:
2 * "foobar"; #converts the string to zero
5 - "2 ALPHAZONE loves you"; #'2' is converted to a number
#and the rest of the string is dropped off
When Perl cannot make any
sense of a string to number conversion, then the string will convert to zero.
Therefore, the first example would equate to 0. The
second example has a number followed by non-digit characters. In this case, 2
is converted to a number and the rest of the characters are dropped. The second
example would then evaluate to 3.
If you have Perl's warnings setting turned on (with the -w option) then Perl
will warn you that you are trying to do this sort of conversion.
Here, why don't we see what kind of warning Perl will give us. This will
(hopefully) get you used to warnings, so they are familiar later on when you do
not expect them. Copy
this code into test.pl
(or any file you wish) and then run it with warnings turned on:
#!/usr/bin/perl -w

$testVar = 2 + "foobar";
print "$testVar\n";
Now that you have run test.pl, try changing it to use the other example.
There is one exception to Perl's automatic conversion. When you have an octal or
a hexadecimal number in string form, Perl will not recognize that it is
supposed to be in a different number system than the default decimal system
(base 10). This means that the expression
10 + '0xff'; would evaluate to 10 instead
of the expected 265.
This happens because the Perl interpreter sees the zero, and then the x
character. Since the x
is not a digit, the rest of the string is dropped. Perl would then be
adding 10 and 0
instead of 10 and the hex number ff. To
force Perl to recognize an octal number or a hexadecimal number that is found in
a string, you will have to use the oct
function. Here is how it works:
10 + oct '0xff'; #this will return 265
15 + oct '046'; #oct will also convert octal numbers
#this will return 53
You could use the hex
function the same way that you use the oct
function, but it will only convert hexadecimal numbers.
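Here is a short sketch of the conversions discussed above. It sticks to strings that convert cleanly, so it runs without warnings; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $sum        = 5 + "23";            # 28, the string is read as a number
my $from_hex   = 10 + oct('0xff');    # 265, oct understands the 0x prefix
my $from_octal = 15 + oct('046');     # 53, octal 46 is decimal 38
my $hex_only   = hex('ff');           # 255, hex treats its input as hexadecimal

print "$sum $from_hex $from_octal $hex_only\n";   # 28 265 53 255
```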
Perl has scalar
comparisons built into the language (as would be expected). It has a way of
testing whether one number is bigger than another, or whether two strings are
equal. These things should not be a surprise; you can do those in just about any
programming/scripting language.
Perl does do something different with comparisons, however. When comparing
numeric scalar data, you use one set of comparison operators. When you
compare string data, you use a completely different set of operators.
This is done so that Perl will know how to compare the data. For example, say
you have the strings "8" and "255".
Now, which is bigger? To answer this question we must take a look at this from
two views. First, if we look at it from a numeric perspective we could use this
test: "255" > "8".
The outcome would be true, since the number 255
is greater than the number 8.
Now, let us look at this from a string point of view. We would use this test:
"255" gt "8". This time, the
test would return false, since strings are compared character by character and
the character '8' is greater than the character '2'.
Perl allows you to do the same types of comparisons with each type of scalar
data. Here is a list of both sets of comparison operators with descriptions:
Comparison Operators
Numeric Operator | String Operator | Description
== | eq | Equal
!= | ne | Not Equal
< | lt | Less Than
> | gt | Greater Than
<= | le | Less Than or Equal To
>= | ge | Greater Than or Equal To
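To watch the two sets of operators disagree on the same data, you could try a sketch like this one. The variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Same operands, two different questions.
my $numeric = ("255" >  "8") ? "true" : "false";   # compared as numbers
my $string  = ("255" gt "8") ? "true" : "false";   # compared as text

# Equality can differ too: equal as numbers, different as text.
my $num_equal = ("1.0" == "1") ? "true" : "false";
my $str_equal = ("1.0" eq "1") ? "true" : "false";

print "$numeric $string $num_equal $str_equal\n";   # true false true false
```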
Like the comparison operators, Perl also has two sets of logic operators. Unlike
the comparison operators, however, the two sets give the same truth results. The
word forms (and, or, not) have lower precedence than the C-style symbols, which
gives you a wider set of options when writing expressions. Here is a table of
the different logical operators:
Logic Operators
C style | Perl style | Description
&& | and | logical AND
|| | or | logical OR
! | not | logical NOT
These operators generally will work as you expect them to. When you use the
logical AND (&&, and), the statements on both sides
of the operator must be true for the entire statement to be true. That is,
#Note: 1 = true, 0 = false
0 && 1;
#--and--
1 and 0;
would both return false,
since a false (0)
statement was found in both of them. The logical OR (||, or)
will return true if at least one
of the operands is true. All three of these statements would return true:
#Note: 1 = true, 0 = false
1 || 1;
0 or 1;
1 || 0;
The logical NOT (!, not)
returns the opposite truth value of its operand. This is how the logical NOT can
be used:
!1; #returns false
not 0; #returns true
not(0); #returns true
The logical AND and OR are
both "short circuiting" operators. What "short circuiting"
means is that both operators will do the least amount of work to get the job
done. In other words, when using the logical AND operator, if the first operand
returns false then Perl will not bother testing the second operand. Why should
it? The same goes for the logical OR, except that if the first operand is true,
then Perl will move on without checking the truth value of the second operand.
Knowing this, you can use short circuiting to your advantage. Here is one example
of how you can use this short circuiting:
open(IN, $fileName) || die "file not found";
The open command is used to open a file. In the above example, if the open
command fails (meaning that the file could not be opened) then the
second operand of the ||
will be evaluated (which will print the given string and end the script).
However, if the file does open, then the open statement will return true and the
logical OR will short circuit without
calling the die
statement. Confused yet? If so, no need to worry, things will become clearer as
you work more with Perl.
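If you would like to watch short circuiting happen, here is a minimal sketch. The helper noisy_true and the file name are hypothetical, invented for this illustration; the sketch records a message instead of calling die so that it keeps running.

```perl
use strict;
use warnings;

# Hypothetical helper that counts how often it is actually called.
my $calls = 0;
sub noisy_true { $calls++; return 1; }

my $and = 0 && noisy_true();   # right side skipped: 0 already decides the AND
my $or  = 1 || noisy_true();   # right side skipped: 1 already decides the OR
print "calls: $calls\n";       # calls: 0

# Same idea with open: the right side of "or" runs only if open fails.
# The file name is made up and is assumed not to exist.
my $message = "opened";
open(my $in, '<', 'no_such_file_for_demo.txt')
    or $message = "could not open";
print "$message\n";
```

In a real script you would usually put die on the right side, as in the example above, so that the script stops when the file is missing.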
Perl: List Data - Introduction
This section looks at how Perl uses lists (arrays and hashes).
First of all, lists are placed within parentheses with each element separated by
commas. Next, lists in Perl are not restricted to certain types. That is, lists
can be made up of numbers, strings, expressions that evaluate into a scalar
form, or even a mixture of all of the above. Here are some examples:
(1, 2, 3, 4, 5)
("a", "b", "c")
("ALPHAZONE", 'HTML Stuff', $a++, 1)
Lists can also have other
lists within them. These sub-lists just become part of the bigger list, however:
(1, 2, 3, ("one", "two", "three"), 4)
#results in (1, 2, 3, "one", "two", "three", 4)
(1, 2, 3, @someArray, 4)
#the contents of the array @someArray will
#be placed within the list
An empty list is just a set
of empty parentheses ().
You can use lists to do variable assignments. Here is an example:
($x, $y) = (1, 2);
#same as $x = 1;
#        $y = 2;
These kinds of variable
assignments are done in parallel, so this kind of statement is okay:
($x, $y) = ($y, $x);
#same as $temp = $x;
#        $x = $y;
#        $y = $temp;
As you can see, the above
assignment saves space and is easier to read. You can also include other lists
in the assignment (like arrays). We will look into that in the next section.
Come to think of it, it is time to move on now.
Arrays are a way
to store list data in a sequential order. You will find arrays in just about any
programming and scripting language. They are just that useful. Perl arrays are
very versatile and can be used in many ways. Here you will take a look at some
of the basics on Perl arrays.
Array Creation:
Perl arrays are pretty flexible. When creating an array, you do not have to
restrict every element to a certain type. An array can hold
all strings, all numbers, or a combination of both. Also, you do not have to set
a size for the array. In fact, Perl arrays can shrink and grow as you need
them to. All you have to do to create an array is to set it equal to a list or
another array. Here are some examples:
@emptyArray = (); #an empty list
@someArray = ("Hello", "from", "ALPHAZONE");
@someOtherArray = @someArray;
#@someOtherArray holds the same
#information as @someArray
@mixedArray = ("one", 2, "three", 4);
#array of mixed types
@reallyMixedArray = (@mixedArray, ("five", 6, "seven"), 8);
#this array will be equal to:
# [ "one" | 2 | "three" | 4 | "five" | 6 | "seven" | 8 ]
As you can see, creating arrays is not very hard. Also, you may have noticed
that array variable names are preceded by the "at" symbol (@).
The variable names follow the same rules that scalar variable names do. You can
start a variable name with either an underscore (_) or a letter. After that
first character in the name you can then use more underscores and letters and
also digits. Array variable names live in a separate namespace from scalar
variable names. That means that you can give an array a name and use the same
one for a scalar variable; Perl will know the difference. (This also works for
hash variable names.)
Accessing Array Elements:
To access elements in an array you would use this notation:
$arrayName[elementPosition];
This should look a little
strange to you. You see that the dollar sign ($)
is used, but didn't I just tell you that array names are preceded by the
"at" symbol (@)?
Well, yes, but that was when you are dealing with the entire array. When you are
working with only one element in the array, then you are dealing with scalar
data. (Remember that scalar data holds only one thing, as does an array element.)
Just remember that when you are referring to the entire array (or at least more
than just one element of it) you will use the "at" symbol. On the
other hand, if you are referring to one specific element within the array then
you will use the dollar sign.
Now, back to the element call. The square brackets ([])
go around the element's position within the array, or the index. This index is an
integer and Perl arrays start indexing at zero. That is, the first
element in the array is at index 0, the second is at index 1, the third at 2,
and so on. Here is how we would get to the second element within @someArray:
@someArray = ("Hello", "from", "ALPHAZONE");
$someScalar = $someArray[1];
#$someScalar would equal "from"
You can change values at
certain indexes using this sort of notation:
$someArray[2] = "us";
#now the array holds:
# [ "Hello" | "from" | "us" ]
Making Arrays Grow:
You can also use the above notation of element access to make the array grow in
size. This is done by setting an array element that is beyond the array's
current size. @someArray
holds only three elements and its last element is at the index of 2. We could
set the element at index 3 to some value and the array would grow. Here is how
this would work:
#using the @someArray from above
$someArray[3] = "all";
#now the array holds:
# [ "Hello" | "from" | "us" | "all" ]
You could also make the
array grow more drastically. Instead of adding one element at the next position
in the array we could have moved further, like to the index of 10. What this
would do is create all the needed elements up to index 10. All
the elements that were not given values would then hold the undefined value
(undef). Here is an example:
#again, using the above @someArray
$someArray[10] = "something";
#now, elements indexed 4 through 9
#hold the undefined value
Array Size and the Last Element:
When we refer to the "size" of an array, we mean the number of
elements an array has. This can be found by setting a scalar variable equal to
an array variable. Here is how we could do it:
$scalarVar = @arrayVar;
It is that simple. The odd
thing about this is how it works. Why does setting a scalar variable (capable of
holding only one piece of data) equal to an array variable (a list which can
hold many, many elements) work in the first place? And why would it return the
size of the array, why not the contents? The short answer is that an array used
where a scalar is expected gives its number of elements. The assignment may
look strange, but it is just the way it is.
Now, about the last element of an array. Knowing that the first element in an
array is found at the index of zero, you could then conclude that the index of
the last element in the array would be one less than the array size. This is
correct thinking, but there is an easier way. This is how you get the index of
the last element of an array:
$#arrayName;
If you would like to access
the last element of an array, then you could do this:
$arrayName[$#arrayName];
There is also an easier way
to do this. You could use negative indexing. Yup, you heard me right, negative
indexing. The last element can be found at the index of -1, the second to last
at -2, and so on.
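Here is a brief sketch that pulls together size, last index, negative indexing, and growing an array. The array @colors and the other names are invented for the example.

```perl
use strict;
use warnings;

my @colors = ("red", "green", "blue");

my $size       = @colors;       # 3, an array in scalar context gives its size
my $last_index = $#colors;      # 2, always one less than the size
my $last       = $colors[-1];   # "blue", negative indexes count from the end

# Assigning past the end makes the array grow; indexes 3 and 4 stay undefined.
$colors[5] = "cyan";
my $new_size = @colors;         # 6

print "$size $last_index $last $new_size\n";   # 3 2 blue 6
```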
Perl: List Data - More on Arrays
Now that we have
looked at some of the basics of Perl arrays, let's see what else we can do with
them.
The Range Operator
One of the basic things that you may want to do with arrays is store a bunch of
sequential data. "But, that is what an array is," you may be thinking.
This is not exactly what I am saying. What if you wanted to store the actual
numbers 1 through 100, one number for each element? You could, of course, enter
every element manually:
@someArray = (1, 2, 3, 4, 5, 6, 7, 8, . . ., 100);
It would be crazy to do
this. Maybe you could use a loop, something like this:
for ($i = 1; $i <= 100; $i++) {
    $someArray[$i - 1] = $i;
}
Don't worry about knowing
how to do loops yet, we will get to those soon enough. This loop was not too
bad, but there is still an easier way. You can use what is called the range
operator. This is how you could use the range operator to fill an array with the
values 1 to 100:
@someArray = (1 .. 100);
You could also do this with
letters:
@someArray = ('a' .. 'z');
You could use more than one
range at a time as well. For instance, you could fill an array with all the
lowercase letters and then the uppercase letters. All you need to do is
separate the ranges by commas:
@someArray = ('a'..'z', 'A'..'Z');
The Quote Word Function
There is another way to help you build your arrays in Perl. You can use the
quote word function, or qw.
You would use this function if you were filling an array with a bunch of single
word strings. What the qw
function does is allow you to make a list without having to add all the
quotation marks and commas. Here is an example:
#Standard way:
@colorsArray = ("red", "green", "blue",
                "magenta", "yellow", "cyan");
#qw way:
@colorsArray = qw(red green blue magenta yellow cyan);
As you can see, this is easier to write out and even to read.
Slices
Now, remember in the last section on the array basics when I mentioned using
parts of arrays in a list context? If you don't, that is okay. You
can access a portion of a list, made up of two or more of its elements. These
portions of a list are known as "slices." Here is how you would call a slice:
@arrayName[index1, index2, [index3, . . ., indexn]];
Please note, the indexes in
the inner square brackets ([])
are optional indexes. Those inner brackets are only notation and are not typed
in real code. Since a slice is a list (or
sub-list) the "at" symbol is used. Let's see how a slice could be
used:
#using the @colorsArray from above:
print "@colorsArray[1, 2, 3]"; #prints: "green blue magenta"
@colorsArray[1, 5, 0] = ("white", "black", "gray");
#@colorsArray will now equal:
# [ "gray" | "white" | "blue" | "magenta" | "yellow" | "black" ]
As you can see, the indexes
do not have to come in order. You can also use the range operator when dealing
with slices. It works as you would expect:
@someArray[1..5];
#same as: @someArray[1, 2, 3, 4, 5]
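The range operator, qw, and slices fit together nicely. Here is a small sketch; the array names are only for illustration.

```perl
use strict;
use warnings;

my @colors  = qw(red green blue magenta yellow cyan);
my @letters = ('a' .. 'e');        # a b c d e

# Read a slice using a range of indexes.
my @middle = @colors[1 .. 3];      # green blue magenta

# Assign to a slice; the indexes do not have to be in order.
@colors[1, 5, 0] = ("white", "black", "gray");

print "@letters\n";   # a b c d e
print "@middle\n";    # green blue magenta
print "@colors\n";    # gray white blue magenta yellow black
```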
Traversing Arrays
There will come a time that you would naturally like to visit every element of
an array and do something to it. Whether it is a simple search or a printing of
the array, you will need to get to every element. One way that you could do this
traversing is to use a for
loop (which we will get into later), but Perl has something that is a little
easier. Perl has the foreach loop. It is designed for traversing arrays and so
makes the process simple. Here is the syntax:
foreach $someScalar (@someArray) {
    . . .
}
The scalar variable, $someScalar,
is a temporary variable which will be set to an element of the array, @someArray,
at each pass of the loop. If you wanted to print every element of an array, then
you could use the foreach loop like this:
@someArray = ('a'..'z');
foreach $element (@someArray) {
    print "$element\n";
}
Array and List Sorting
Perl has its own built-in sort function to make life easier for you. This is
really quite simple. Here is an example of a sort:
@someArray = sort @sortMe;
The array, @someArray, will hold the sorted array and @sortMe
will be left unchanged.
This sort will order the array by ASCII value. That is, if the array @sortMe is
full of numbers, then you may find that your sort will return
something unexpected. For example, if you had the numbers 546 and 91, then 546
would come first in the list. This is because the character '5' comes before
'9' in the ASCII character set. If you would like to sort by numeric value, then
you will want to use the <=>
operator. Here is the syntax of a numerical sort:
@sortedArray = sort {$a <=> $b} @numsArray;
It is important that you
use the variables $a
and $b; these are the names that sort fills in for each pair it compares. That
is, this sort will not work as intended:
@sortedArray = sort {$foo <=> $bar} @numsArray;
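To compare the default sort with a numeric sort, you could run a sketch like the following. The array names are made up; $a and $b are the variables that sort itself provides.

```perl
use strict;
use warnings;

my @nums = (546, 91, 8, 1000);

my @as_strings = sort @nums;                 # compared character by character
my @as_numbers = sort { $a <=> $b } @nums;   # compared by numeric value
my @descending = sort { $b <=> $a } @nums;   # swap $a and $b to reverse

print "@as_strings\n";   # 1000 546 8 91
print "@as_numbers\n";   # 8 91 546 1000
print "@descending\n";   # 1000 546 91 8
```

Swapping $a and $b inside the braces is a simple way to get the reverse order.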
List Assignments
We briefly looked at using lists to assign values to variables. For example, you
can use lists to do a simple swap, or just a simple assignment:
($x, $y) = ($y, $x); #the values have been swapped
($a, $b, $c) = (1, 2, 3);
In the above example I used
only scalar variables, but you can also use arrays. Here is an example:
($scalarA, $scalarB, @arrayA) = (1, 2, 3, 4, 5);
#$scalarA will equal 1
#$scalarB will equal 2
#@arrayA will equal [ 3 | 4 | 5 ]
As you can see, the array
took all that was left in the list. Arrays are greedy variables, so be careful
where you place them within the list. For example, if the array is placed in
the middle of the list then it will leave nothing for the variables after it:
($scalarA, @arrayA, $scalarB) = (1, 2, 3, 4, 5);
#$scalarA will equal 1
#@arrayA will equal [ 2 | 3 | 4 | 5 ]
#$scalarB will be undefined
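Here is a runnable sketch of the same behavior. The variable names are chosen just for this illustration.

```perl
use strict;
use warnings;

# An array at the end collects whatever is left over.
my ($first, $second, @rest) = (1, 2, 3, 4, 5);

# An array in the middle is greedy: nothing remains for $tail.
my ($head, @greedy, $tail) = (1, 2, 3, 4, 5);

# Parallel assignment swaps two values without a temporary variable.
my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);

print "$first $second [@rest]\n";   # 1 2 [3 4 5]
print "$head [@greedy]\n";          # 1 [2 3 4 5]
print defined $tail ? "tail set\n" : "tail undefined\n";   # tail undefined
print "$x $y\n";                    # 2 1
```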
There is one more topic that we will look at with arrays: reading input into
arrays and writing arrays to output. This is covered in the next section.
Perl: List Data - Input and Output with Arrays
We have gone
over some of the basic inputting and outputting with the uses of scalar
variables. Arrays, however, follow some different rules than scalars do. This is
why we revisit the standard input and output.
Printing Arrays
If a simple printing of an array is all you are looking for, then you could just
use the print function. As you would guess, you can print a single element of an
array and it will act just as if you were printing a scalar variable. If,
however, you were to print the entire array, or a slice of one, then you will
come across some differences. First, you can print an array like this:
@foo = (1..10);
print @foo; #this would print: "12345678910"
As you can see, this is not very reader friendly. However, if you were to put
the array within double quotes, then the array will be printed in a better
format:
@bar = (1..10);
print "@bar";
#this would print: "1 2 3 4 5 6 7 8 9 10";
#here is a slice:
print "@bar[2..5]"; #prints: "3 4 5 6"
As you can see, Perl will separate each element in the array with a blank space.
If this is not to your liking, however, you could always change the value of
$", one of Perl's many reserved variables. This variable is given the value of a
blank space by default and is used in between each element when the array is
printed. Here is how this works:
@foobar = (1..6);
$" = " | ";
print "[ @foobar ]";
#prints: "[ 1 | 2 | 3 | 4 | 5 | 6 ]"
You would use the same idea for printing arrays that you would to dump an entire
array into one scalar variable. That is, you would set a scalar variable equal
to an array that was placed within double quotation marks. The same rule about
the $" variable applies here as it did with printing.
There are two more special variables that Perl has that help control the way
that arrays are displayed. The first of these two is $,. This works like the $"
variable, but it works on arrays that are not interpolated within double
quotation marks. $, has a default value of empty. Here is one way that this
variable can be used:
$, = " : ";
@someArray = (1..5);
print @someArray; #prints "1 : 2 : 3 : 4 : 5";
Perl also has the $\ variable. This variable controls what is printed at the end
of every print statement, so it shows up after an array that you print. Like the
$, variable, $\ has a default value of empty. Here is an example of how this can
be used:
$\ = " the end";
@anArray = (1..5);
print @anArray; #prints "12345 the end"
print "@anArray"; #prints "1 2 3 4 5 the end"
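Because these special variables affect every later print in the script, it can be worth limiting a change to one block, or avoiding the change altogether. The sketch below shows two common ways to do that: the built-in join function and the local keyword. The variable names are only for illustration.

```perl
use strict;
use warnings;

my @anArray = (1..5);

# join builds the string explicitly and leaves $" and $, untouched.
my $joined = join(" | ", @anArray);
print "[ $joined ]\n";    # [ 1 | 2 | 3 | 4 | 5 ]

my $inside;
{
    # local puts the old value of $" back when the block ends.
    local $" = "-";
    $inside = "@anArray";
}
my $outside = "@anArray";

print "$inside\n";     # 1-2-3-4-5
print "$outside\n";    # 1 2 3 4 5
```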
Arrays and Input
Reading input straight into an array is something that you will not find in many
languages. As we have seen, arrays are greedy and this attribute carries over to
reading input from an external source, like the standard input stream. If you
were to have this call:
@anArray = <>;
the array @anArray will read data until the end of data (or end of file) has
been reached. Each line of input becomes one element of the array. In the case
of standard input you should be able to indicate an end of input with Ctrl + D
under UNIX and Linux or Ctrl + Z under Windows.
This greediness of arrays can be very useful. You can read in entire files
(hopefully the files are not too big) at a time and then worry about processing
that information afterwards. Even though you do not know how to open and read
from files yet, you can simulate this at the command line. Remember that you can
redirect the contents of a file into the standard input. Here is a quick
example:
~$ someScript.pl < someInputFile
When the end of the file has been reached, the reading of input stops.
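To try this without a real input file, the sketch below reads from a filehandle opened on a string, which stands in for standard input. It assumes Perl 5.8 or later (where a filehandle can be opened on a scalar), and the sample text and variable names are made up. It also uses chomp to strip the newline that each line keeps when it is read.

```perl
use strict;
use warnings;

# Stand-in for a file or standard input: a filehandle opened on a string.
my $text = "first line\nsecond line\nthird line\n";
open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";

# In list context the read operator returns every remaining line at once.
my @lines = <$in>;
close($in);

# Each element still ends with its newline; chomp removes them all.
chomp(@lines);

my $count = @lines;
print "Read $count lines\n";        # Read 3 lines
print "Last line: $lines[-1]\n";    # Last line: third line
```

In a real script you would replace <$in> with <> (or <STDIN>) and feed the file in with the redirect shown above.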
Now that we have had our fun with arrays it is time to look at hashes. Hashes,
though they are forms of list data, are very different from arrays. You will see
this in the next section.
Perl: List Data - Hashes
When I hear the word "hash" I think of Perl. Yes, that is right, I think of the
list type variable that holds information referenced by keys instead of indexes.
Even though arrays and hashes are related to each other, in that they both
contain list type data, hashes behave very differently.
A hash is kind of like a big pool of data. Each data element is attached to a
key. These keys can be words or numbers. The information, like in a pool, does
not really come in any real order. Here is an illustration of the idea of a
hash:
As you can see, hashes are a little more disjointed than arrays, which have a
nice sequential order.
The great thing about hashes is that you can use descriptive names to indicate
where a value is. That is, say you had a group of people, each a different age.
You could store the people's ages at elements that have their names as the keys.
Creating Hashes
To create hashes you use the percent symbol (%). As with arrays, you can set a
hash variable equal to a list to fill it with content. Here is an example:
%hashVar = ("one", 1, "two", 2, "three", 3);
Unlike arrays, however, this does not create a hash that has six elements. It
creates a hash with only three elements. The first item in the list, "one", is a
"key" which is attached to the value of 1. The items "two" and "three" are also
"keys" with 2 and 3 as their respective values. As you can see, keys and their
values are paired off in the list assignment, but what if the list has an odd
number of items? The last key will then get an undefined value.
Accessing Hash Elements
Again, like arrays, if you want to access an element in a hash you will be using
the hash in a scalar context. Because of this you will have to use the dollar
sign instead of the percent symbol. You will use a set of curly brackets to
enclose the key of the element that you want to access. Here is an example:
%someHash = ("one", "1cow", "two", "2cows", "three", "3cows");
print $someHash{"two"}; #prints "2cows"
You can, of course, also use a variable in place of the literal string:
$someVar = "two";
%someHash = ("one", "1cow", "two", "2cows");
print $someHash{$someVar}; #prints "2cows"
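If you were to try to use the value of a key that is not in the hash, you will
get the undefined value, and with the -w switch Perl will warn you that you are
using an uninitialized value.
You can also access more than one element at a time with a hash slice, much like
you can with array slices. A hash slice is written with the at symbol (@) and
curly brackets, for example @someHash{"one", "two"}.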
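The sketch below puts creation and access together, using the idea of storing people's ages under their names. The names and ages are invented for illustration. It also shows that Perl does allow a hash slice, written with @ and curly brackets, which returns several values at once.

```perl
use strict;
use warnings;

# Hypothetical people and their ages, keyed by name.
my %ages = ("Alice", 31, "Bob", 27, "Carol", 45);

# A single element is a scalar, so it takes the dollar sign.
my $bobsAge = $ages{"Bob"};
print "Bob is $bobsAge\n";    # Bob is 27

# A hash slice returns a list of values, in the order the keys were given.
my @two = @ages{"Alice", "Carol"};
print "@two\n";               # 31 45

# A key that was never stored yields the undefined value, not an error.
my $missing = $ages{"Dave"};
print "Dave is not in the hash\n" unless defined $missing;
```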
Making Hashes Grow
As with arrays, you can make hashes grow easily. All you have to do is use a key
that does not exist and set a value to that element. To make sure that the slot
is empty you will have to do a test, and for this sort of test Perl provides you
with the exists function. Here is an example that only adds the element if the
key is not already in the hash:
%fooHash = ("one", 1, "two", 2);
if(!exists $fooHash{"three"}) {
    $fooHash{"three"} = 3;
}
Don't worry about the if statement, you will learn about it later. But, since we
are on the subject, just remember that the number zero and an empty set of
quotation marks will make a conditional statement false, just like an undefined
variable. The exists function only checks whether the key is there, so it is not
fooled by values like these. You will want to keep this in mind if you are doing
a test like this.
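To make the difference concrete, the sketch below runs three tests against the same keys: exists, defined, and a plain truth test. The hash contents and key names are made up for illustration.

```perl
use strict;
use warnings;

my %fooHash = ("zero", 0, "empty", "", "nothing", undef);

# exists asks only whether the key is present, whatever its value.
# defined asks whether the stored value is something other than undef.
# A plain truth test also rejects 0 and the empty string.
my @report;
foreach my $key ("zero", "empty", "nothing", "absent") {
    my $exists  = exists  $fooHash{$key} ? "yes" : "no";
    my $defined = defined $fooHash{$key} ? "yes" : "no";
    my $true    = $fooHash{$key}         ? "yes" : "no";
    push @report, "$key: exists=$exists defined=$defined true=$true";
}
print "$_\n" foreach @report;
# zero: exists=yes defined=yes true=no
# empty: exists=yes defined=yes true=no
# nothing: exists=yes defined=no true=no
# absent: exists=no defined=no true=no
```

Only the exists test tells the key "nothing" (present, but holding undef) apart from the key "absent" (not in the hash at all).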
Making Hashes Shrink
Just as you can add new elements with new keys to a hash, you can take them
away. To do this you will have to use the delete function. This will delete the
element you specify from the hash (both the key and value) and return the value
that was deleted. You can use this to move one element from hash "a" over to
hash "b." Here is how this works:
$hashB{$someKey} = delete $hashA{$someKey};
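A complete version of that move might look like the following sketch. The hash contents are invented for illustration.

```perl
use strict;
use warnings;

my %hashA = ("apple", 3, "pear", 5);
my %hashB;

# delete removes the pair from %hashA and hands back the value,
# which is then stored under the same key in %hashB.
my $someKey = "pear";
$hashB{$someKey} = delete $hashA{$someKey};

print "pear is gone from hashA\n" unless exists $hashA{"pear"};
print "hashB pear: $hashB{pear}\n";    # hashB pear: 5
```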
The general functionality
of hashes is limited compared to arrays. You will see some of the restrictions
that hashes have upon them in the next section.
Perl: List Data - More on Hashes
Perl hashes can be very valuable and you may come to depend upon them, but first
there is more that you will need to know than just the basics.
Hash Assignment Using Pairing:
There is an alternate way to assign a list of data to a hash. Instead of
separating every item in the list with commas, keys and values alike, you can
keep each key/value pair together with the => operator. Here is the code:
%hashPairs = (one=>1, two=>2, three=>3);
The keys are on the left side and the values are on the right. As you can see, I
did not place the keys within quotation marks. This can be done because the keys
that are used are simple, single-word keys.
Hash and Array Conversion:
Arrays and literal lists often work in similar ways. In fact, arrays and literal
lists can take each other's places in many cases. That is, when a list is
expected in a Perl script, you can often simply use an array instead. This
property of arrays carries over to hash assignments. Here is an example:
@array = ("one", 1, "two", 2, "three", 3);
%hash = @array;
#--This is the same as:
%hash = ("one", 1, "two", 2, "three", 3);
Taken in the opposite direction you will have somewhat similar results. You will
find that you can set an array equal to a hash. The hash will be "unwound" and
its elements will be split up into their two basic parts, the key and value. The
parts will be placed within the array as separate elements. Also, remember how I
said that hashes are like pools of information? The basic idea of that statement
was that a hash will unwind its elements in no particular order. Here is one
possible example:
%hash = (one=>1, two=>2, three=>3);
@array = %hash;
#this array may be equal to:
# [ "three" | 3 | "two" | 2 | "one" | 1 ]
Traversing a Hash:
When you wanted to traverse an array I showed you that you could use the foreach
loop. This loop expects an array (or list) as one of its arguments. If you
decided to try the same idea on hashes, then Perl would unwind the hash into its
basic list components like it did when you tried to convert the hash to an
array. Who knows, this may be what you want, but in most cases you will only be
concerned with the values that are stored within the hash. Because of this need
Perl has the keys and values functions. As you would expect, the keys function
will extract the keys from a hash and the values function will extract the
values. Both of these functions return a list of their respective items. If you
only wanted to do something simple to each value of a hash, like printing them,
then you would want to use the values function. Here is how you could print all
the values of a hash:
%someHash = (one=>1, two=>2, three=>3);
foreach $value (values %someHash) {
    print $value;
}
If you wanted the values to be sorted, say by number, then you could do
something like this:
%someHash = (one=>1, two=>2, three=>3);
foreach $value (sort {$a <=> $b} values %someHash) {
    print $value;
}
On the other hand, there will be times that you will want to change every value
of a hash. In this case you will want to use the keys function. (In older
versions of Perl the values function only made a copy of the values of the hash,
so changing those copies did not affect the original. Going through the keys
works in every version.) Here is how you can change the values of a hash:
%someHash = (one=>1, two=>2, three=>3);
foreach $key (keys %someHash) {
    $someHash{$key}++;
}
(You could also use this to do the simple stuff like the printing of values, of
course.)
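Since a hash unwinds in no particular order, a report usually reads better if the keys are sorted first. The sketch below prints each key with its value in alphabetical key order, then lists the keys ordered by their values. It reuses the sample hash from above; the other variable names are only for illustration.

```perl
use strict;
use warnings;

my %someHash = (one => 1, two => 2, three => 3);

# Sorting the keys gives a predictable order for the report.
my @lines;
foreach my $key (sort keys %someHash) {
    push @lines, "$key = $someHash{$key}";
}
print "$_\n" foreach @lines;
# one = 1
# three = 3
# two = 2

# Sorting the keys by the values they point to instead:
my @byValue = sort { $someHash{$a} <=> $someHash{$b} } keys %someHash;
print "@byValue\n";    # one two three
```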
Other Hash Issues:
When you print a hash, you do not get all the different options that you were
given with arrays. Pretty much all you can use is the print command with a
simple hash:
print %someHash;
The hash will unwind and then be printed as if it were unwound into an array and
then printed. Hashes are not interpolated within a string, so this will only
print the literal text %someHash:
print "%someHash";
You are pretty much forced to use the values or keys functions with the foreach
loop if you want to print a hash out in a friendly format.
Now, how would you find the number of elements that a hash has? If you use past
experience you may say just set a scalar variable to equal a hash, like this:
$length = %myHash;
This, however, is not a reliable way to get the length of the hash. In older
versions of Perl it will, instead, give you the "state" that the hash is in (a
description of its internal storage), which is probably not what you would like.
The simplest way to find the number of elements in a hash is to use the keys
function in a scalar assignment, which gives you the number of keys:
$length = keys %someHash;
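The sketch below shows the count in a few settings. The scalar function, used here to force scalar context inside print's argument list, and the emptiness test are additions not covered above.

```perl
use strict;
use warnings;

my %someHash = (one => 1, two => 2, three => 3);

# Assigning to a scalar puts keys in scalar context, giving the count.
my $length = keys %someHash;
print "The hash has $length elements\n";          # The hash has 3 elements

# Inside a list, such as print's arguments, scalar() forces the same context.
print "Count: ", scalar(keys %someHash), "\n";    # Count: 3

# An empty hash is false in a condition, which makes a quick emptiness test.
my %empty;
my $emptyCount = keys %empty;
print "the hash is empty\n" unless %empty;        # the hash is empty
```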
Perl: Conditional Statements
Perl, of course,
has conditional statements. Why wouldn't it? You cannot really do too much
without conditional statements. If you did not have them, then you would have
pretty boring scripts.
The if and if/else Statements
This is what a general if looks like:
if (condition) {
    #. . . code to be executed if true
}
This is not too difficult.
We have already looked at how you can compare scalar variables, but we will take
some time here to review. Also, we will look at some other ways to get a truth
value out of Perl. If you would like, you can review the lesson that looked at
comparison and logical operators or you can just take a look at the comparison
and logic operator table listing.
As you may remember, there are two sets of comparison and logic operators. For
the comparison operators the two sets determine the type of comparison you will
be doing. The symbolic operators (like == and <) perform a numeric comparison,
whereas the alphabetic operators (like eq and lt) are used to compare strings.
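Picking the wrong set gives a different answer for the same operands, as this small sketch shows. The variable names and values are made up for illustration.

```perl
use strict;
use warnings;

my ($x, $y) = ("10", "9");

# Numeric comparison: 10 is greater than 9.
my $numeric = ($x > $y)  ? "greater" : "not greater";

# String comparison: "10" sorts before "9" because "1" comes before "9".
my $string  = ($x gt $y) ? "greater" : "not greater";

print "numeric: $numeric\n";    # numeric: greater
print "string: $string\n";      # string: not greater

# "1.0" and "1" are the same number but different strings.
my $sameNumber = ("1.0" == "1") ? "yes" : "no";
my $sameString = ("1.0" eq "1") ? "yes" : "no";
print "same number: $sameNumber, same string: $sameString\n";
# same number: yes, same string: no
```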
You can, of course, test more than one condition by joining the conditions with
the logic operators. Perl supports two sets of these: the symbolic (like &&) and
the alphabetic (like and) logical operators. Unlike the comparison operators,
however, these two sets do the same things. The only difference is precedence:
the alphabetic forms bind more loosely than the symbolic ones. When using the
AND and the OR logic operators you should also remember that they are short
circuiting. That is, if the first condition is true then the OR operator will
automatically return true without evaluating the second condition. Likewise, if
the first condition is false with the AND operator, false will be returned
without evaluating the second condition. With this in mind, you may want to
place the condition that is more likely to decide the result first.
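Short circuiting is useful for more than speed. In the sketch below (the variable names are hypothetical) it guards a division and supplies a default value.

```perl
use strict;
use warnings;

my $divisor = 0;

# && stops at the first false operand, so the division below is never
# attempted while $divisor is zero.
my $ratioIsLarge = ($divisor != 0 && 10 / $divisor > 1) ? "yes" : "no";
print "ratio is large: $ratioIsLarge\n";    # ratio is large: no

# || stops at the first true operand, which makes it handy for defaults.
my $input = "";
my $name  = $input || "guest";
print "Hello, $name\n";                     # Hello, guest
```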
When working with Perl you will not always have a condition that needs either
the comparison operators or logic operators. You can use actual variables for
testing. Here are the basic things that will return a false value: the number
zero, an empty string, and an undefined value.
You could then use a variable as the condition:
if($someVar) {
    #some block of code
}
If the variable holds a value of zero, an empty string, or no value at all, then
the condition will return false. There are times that you will want to accept
the zeros and empty strings, leaving only the undefined value to return false.
For this you can use the defined function. This works a lot like the exists
function for hashes. The defined function will return a value of false only if
the variable it is passed is undefined. Here is an example:
if(defined $someVar) {
    #some block of code
}
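The sketch below checks a handful of values for truth, then shows defined accepting a zero. The list of sample values is my own; note in particular that the string "0" is false too, while "0.0" and a single space are true.

```perl
use strict;
use warnings;

my @labels = ('0', '"0"', '""', 'undef', '"0.0"', '" "', '42');
my @values = ( 0,   "0",   "",   undef,   "0.0",   " ",   42 );

my @results;
for (my $i = 0; $i < @values; $i++) {
    my $truth = $values[$i] ? "true" : "false";
    push @results, "$labels[$i] is $truth";
}
print "$_\n" foreach @results;
# 0 is false
# "0" is false
# "" is false
# undef is false
# "0.0" is true
# " " is true
# 42 is true

# defined only cares whether a value is there, not whether it is true.
my $someVar = 0;
my $status = defined $someVar ? "defined" : "undefined";
print "\$someVar is $status\n";    # $someVar is defined
```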
Now back to the if statement. All that really is left to be said about the if
statement is that it must have the curly brackets, even if only one line of code
will be executed when it is true. Here are some examples of if statements:
if($a == $b) {
    #then do something. . .
}
if($foo eq "foobar") {
    print '$foo is equal to "foobar"',"\n";
}
if(!"" || "") {
    print "This should print";
}
if( $someArray[3]
    && ($someArray[3] != 0
    || $someArray[3] ne "") ) {
    $someArray[3] += 53;
    print "\aSome thing has been done\n";
}
If you would like to also do something if the condition ended up returning
false, then you would use the else command. Like with if, the else must be
followed with a set of curly brackets regardless of what is to be executed. Here
are some examples:
if($conditionTrue) {
    print "If true, then this is printed";
} else {
    print "If false, then this is printed";
}
if("") {
    print "This will not print";
}
else {
    print "This will print";
}
If you wanted to do another test at the else part of the statement, then you can
use the elsif command (this is not a typo). Here is an example:
if($someVar == 1) {
    print $someThing;
} elsif($someVar == 2) {
    print $someOtherThing;
} else {
    print $somethingElse;
}
Of course, if using the elsif is not to your liking, then you can just nest your
if/else statements within each other. Here is the above example rewritten with
nested if/else statements:
if($someVar == 1) {
    print $someThing;
} else {
    if($someVar == 2) {
        print $someOtherThing;
    } else {
        print $somethingElse;
    }
}
(Note: Perl does not have a switch function like some other languages do,
however you can easily use a series of if. . . elsif. . . elsif. . . else
statements to get a similar effect.)
The unless Statement
There are times you are only concerned if a condition is false. You can just use
the NOT operator or equivalent, but there is also a different option. Perl also
has the unless command. This works just as an if statement would, but it will
execute its block of code if its condition is false. Here is an example:
unless($someVar == 1) {
    print '$someVar did not equal 1', "\n";
}
#--This is the same as:
if(!($someVar == 1)) {
    print '$someVar did not equal 1', "\n";
}
#--and this:
if($someVar != 1) {
    print '$someVar did not equal 1', "\n";
}
The Conditional Operator
Perl has its very own conditional operator. For simple tests this operator can
save you some extra coding. Here is what the conditional operator looks like:
condition ? trueThing : falseThing;
This operator is often used for a quick test and assign. Here is an example:
$someVar = $testFlag == 1 ? "http://www.tucows.com" : "some other site";
This bit of code is the same as writing this out:
if($testFlag == 1) {
    $someVar = "http://www.tucows.com";
} else {
    $someVar = "some other site";
}
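Two more small uses, as a sketch with made-up variable names: choosing a word inside a message, and chaining the operator to pick between three results. The chained form is an extra not covered above; each test is only reached if the ones before it were false.

```perl
use strict;
use warnings;

# Pick the singular or plural form of a word.
my $count   = 1;
my $noun    = ($count == 1) ? "file" : "files";
my $message = "Found $count $noun";
print "$message\n";       # Found 1 file

# Chained: works like if . . . elsif . . . else.
my $n    = 42;
my $size = ($n < 10)  ? "small"
         : ($n < 100) ? "medium"
         :              "large";
print "$n is $size\n";    # 42 is medium
```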
Perl: Loops
The close relatives of conditional statements are loops. Perl has a number of
different loops to fit any of your repetition needs.
The while and until Loops:
One of the simplest loops is the while loop. It basically works just like an if,
where you execute a block of code while a condition is true. Here is the basic
form of the while loop:
while(condition) {
    # block of code to be executed
}
Also like the if statements, you will find that the while loop, as with all
other Perl loops, must be blocked off with a set of curly brackets. The
condition, of course, can be any Perl expression that can give a truth value.
You can leave the conditional test out to make an infinite loop. while loops
will check the condition even before the loop starts, and so the loop has the
potential to never run at all. Here is an example:
while(0) {
    #some block of code
}
Since the condition will
return a false value the loop will never get a chance to do anything.
The until loop is much like the unless statement. The until will only run as
long as its condition is false, or runs "until" the condition is true. That is,
these two loops will do the same thing:
while($thisCondition != $true) {
    #do something
}
until($thisCondition == $true) {
    #do something
}
The do. . .while and do. . .until
do loops function like their while and until counterparts. The main difference
you will find (other than the actual syntax) is that do loops will run the block
of code and then check if the condition is true. Because of this, do loops will
always run at least one time, regardless of the condition. Here is how you would
write some do loops:
do {
    #do something
} while($this == $works);
do {
    #do something
} until($that == $fails);
You will see that the while and the until parts come after the block. Also, make
note of the semicolons (;) that follow the do loops. These are needed.
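The "at least one time" rule is easy to see by counting. In this sketch (the counter names are made up) the condition is false from the very start:

```perl
use strict;
use warnings;

my $ready     = 0;    # the condition is false from the start
my $whileRuns = 0;
my $doRuns    = 0;

# The test comes first, so the block never runs.
while ($ready) {
    $whileRuns++;
}

# The block comes first, so it runs once before the test is made.
do {
    $doRuns++;
} while ($ready);

print "while body ran $whileRuns time(s)\n";    # while body ran 0 time(s)
print "do body ran $doRuns time(s)\n";          # do body ran 1 time(s)
```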
The for and foreach Loops
The for loop is different from the other loops. It can take more than one
parameter and is generally used in fairly well defined situations. The for loop
takes three parameters: the counter variable initialization, the test, and the
increment section. Here is the syntax of the for loop:
for(init; test; increment) {
    #do some stuff
}
One common use of the for loop is to traverse an array by index. Yes, you could
just use the foreach loop (its loop variable refers to the actual array element,
so it can change values too), but the for loop also gives you the index of each
element. Here is how you could use a for loop to change each value of an array:
for($i = 0; $i < @array; $i++) {
    $array[$i] += 2;
}
This basically says that you want to start at the beginning of the array ($i =
0) and keep running while the index is still inside the array ($i < @array),
stopping once the end has been reached. At each index of the array, you will add
two to the element ($array[$i] += 2). Next, move to the next index ($i++) and
make sure that the index is still in the array. If it is, then repeat the
process.
The same thing can be done with a while loop, but it will not be as compact.
Here is an example:
$i = 0;
while($i < @array) {
    $array[$i] += 2;
    $i++;
}
You can have extra parameters in the first and third sections of the for loop.
You would just separate these extra parameters with commas:
for($i = 0, $j = $#array; $i < $j; $i++, $j--) {
    #do something
}
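One thing you could do with two counters like these is reverse an array in place: $i walks forward from the front, $j walks backward from the end, and the two elements are swapped until the counters meet. This is only a sketch with a made-up array (Perl also has a built-in reverse function).

```perl
use strict;
use warnings;

my @array = (1..5);
my ($i, $j);

# Swap the outermost pair, then move both counters toward the middle.
for ($i = 0, $j = $#array; $i < $j; $i++, $j--) {
    ($array[$i], $array[$j]) = ($array[$j], $array[$i]);
}

print "@array\n";    # 5 4 3 2 1
```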
You can leave some or all of the parameters of the for loop out. You will,
however, have to leave the semicolons in:
for($i = 0; ; $i++) {
    #do something
}
for( ; ; ) {
    #do something
}
The foreach is the loop that has the array in mind. If you want to visit every
element in the array and do not need to know each element's index, then you will
want to use this loop. As you have seen, the foreach loop will make it so that
you do not have to keep track of a counter, and you can use a simple,
descriptive variable name to refer to the element. Again, here is what the
foreach loop looks like:
foreach $elementVar (@arrayVar) {
    #do something with $elementVar
}
The variable $elementVar is the temporary variable that the loop uses to refer
to the array element. The variable @arrayVar is, of course, the array that you
are to traverse. I know, we have already gone through all of this stuff about
the foreach loop when we covered arrays, but there are still some little extras
that we have not covered.
Remember that arrays and literal lists are practically interchangeable. Because
of this you can just use a list in place of the array variable:
foreach $someVar (1, 2, 3, 4, 5) {
    #do something
}
#--or--
foreach $someVar (1..5) {
    #do something
}
As you can see, you can even use the range operator to make your list.
The element variable in the foreach loop call is an optional parameter. If you
leave this variable out, then Perl will revert to using the special variable $_.
Here is an example:
foreach (@array) {
    print $_;
}
The variable $_ is used by many Perl functions when a parameter is not passed.
You will come in contact with this variable many times. In fact, the print
function will revert to the $_ variable. With this you can make the above
foreach loop even simpler:
foreach (@array) {
    print;
}
#--or this, for those of you who would
# like to save even more space
foreach (@array) { print; }
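One detail worth knowing about foreach: inside the loop, the element variable (or $_) is not a copy but another name for the actual array element, so assigning to it changes the array. A minimal sketch, with a made-up array:

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

# $element stands for the real array element, so this changes @array.
foreach my $element (@array) {
    $element += 2;
}
print "@array\n";    # 3 4 5

# The same is true when the loop falls back to $_.
foreach (@array) {
    $_ *= 10;
}
print "@array\n";    # 30 40 50
```

This only works when the loop runs over an array. Looping over a literal list such as (1, 2, 3) and assigning to the element is an error, because constants cannot be modified.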
Now that we have seen all the loops, let's move to the next section,
controlling loops.
Perl: Controlling Loops
There will be
times in your Perl scripting lives that you just want to get out of a loop
before you have defined the loop to stop. What if you were doing a search on an
array and you were only concerned about the first time you come across the
search string? You would not want to have to finish going through the array when
it is not necessary. Because of this need Perl has given you loop controls.
last, next, and redo:
last, next, and redo: here you have your basic loop controls. These loop
controls will only work on while, until, for, and foreach loops (sorry, do
loops).
The last command will end the loop completely. The next command will stop the
execution of the current iteration and then jump to the next iteration. The redo
command will also stop the current iteration, but it will then start that same
iteration over again. Here are some examples of how these commands behave:
#--last
foreach (1..5) {
    print "$_ ";
    if($_ == 3) { last; }
}
#the output:
# "1 2 3 "
#--next
foreach (1..5) {
    if($_ == 3) { next; }
    print "$_ ";
}
#the output:
# "1 2 4 5 "
#--redo
$redoDone = 0; #only want to redo once
foreach (1..5) {
    print "$_ ";
    if($_ == 3 && !$redoDone) {
        $redoDone = 1;
        redo;
    }
}
#the output:
# "1 2 3 3 4 5 "
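The search described at the start of this section can be sketched with last. The list of names and the search string are invented for illustration; the loop stops at the first match even though the string appears twice.

```perl
use strict;
use warnings;

my @names   = ("ann", "bob", "cid", "bob");
my $search  = "bob";
my $foundAt = -1;    # -1 means "not found"
my $checked = 0;     # how many elements were looked at

for (my $i = 0; $i < @names; $i++) {
    $checked++;
    if ($names[$i] eq $search) {
        $foundAt = $i;
        last;        # no need to look at the rest of the array
    }
}

print "Found $search at index $foundAt after $checked checks\n";
# Found bob at index 1 after 2 checks
```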
Loop Labeling
As you can see, these loop controls can be very helpful, but they are still
limited to the current loop. To get around this limitation Perl allows you to
label loops. To label a loop all you have to do is place the label name
(traditionally this name is in all uppercase letters) followed by a colon (:)
in front of the loop call. Here is an example:
MY_LOOP: while($someCondition) {
    #do something
}
Now, if you would like to exit the loop labeled MY_LOOP then you will just use
the last command followed by the loop label. Here is an example:
MY_LOOP: while($someCondition) {
    if($someCondition eq "quit") {
        last MY_LOOP;
    }
    #else, do some other stuff
}
You can do this with the other commands as well. The above example is not really
very useful, since you already know how to exit out of the current loop.
Perhaps this is a better example:
MAIN: for($i = 0; ; $i++) {
    print "$i ";
    for($j = 0; $j < 5 ; $j++) {
        if(($i + $j) == 10) {
            last MAIN;
        }
    }
}
#This would output:
# "0 1 2 3 4 5 6 "
This is a fairly simple
example. As you can see the outer loop is infinite, but the inner loop has a way
to exit the outer loop.
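Labels work with next as well. As a sketch, the loop below collects the prime numbers up to 20: as soon as the inner loop finds a divisor, next with the label abandons that candidate and moves the outer loop on. The label and variable names are made up for illustration.

```perl
use strict;
use warnings;

my @primes;

CANDIDATE: foreach my $n (2..20) {
    # Try every possible divisor below $n (the list is empty when $n is 2).
    foreach my $d (2..$n - 1) {
        if ($n % $d == 0) {
            next CANDIDATE;    # not prime: skip to the next $n
        }
    }
    push @primes, $n;          # only reached if no divisor was found
}

print "@primes\n";    # 2 3 5 7 11 13 17 19
```

A plain next here would only skip to the next divisor in the inner loop, and every number would end up in the list.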
Now that you have loops and how to control them under your belt, you are set to
get out there and start to get your teeth into some of the cooler parts of Perl.