Each output line consists of a list of words

an article added by: Jorge Martinez at 04262008



These lines have to be sorted using an alphabetic ordering that uses the sub-string starting at the keyword. The keyword starts after column 50, so we require a special sort helper routine that picks out these sub-strings. The sort routine is similar to the numeric_sort illustrated earlier. It relies on the convention that, before the routine is called, the global variables $a and $b will have been assigned the two data elements (in this case report lines) that must be compared.

sub by_keystr {
   my $str1 = substr($a,50);
   my $str2 = substr($b,50);
   if($str1 lt $str2) { return -1; }
   elsif($str1 eq $str2) { return 0; }
   else { return 1; }
}

This subroutine requires local variables to store the two sub-strings. Perl permits the declaration of variables whose scope is limited to the body of a function (or to an inner block in which they are declared). These variables are declared with the keyword my; here the sort helper function has two local variables, $str1 and $str2. These contain the sub-strings starting at position 50 of the two generated lines. The lt and eq comparisons done on these strings could be simplified using Perl’s cmp operator (it is a string version of the <=> operator mentioned in the context of the numeric sort helper function).

The body of the main while loop works by splitting the input line into a list of words and then processing this list.

while($title = <STDIN>) {
   chomp($title);
   @Title = split / / , $title;
   ...
   foreach $i (0 .. $#Title) {
      $Word = $Title[$i];
      ...
   }
}

Each word must be tested to determine whether it is a keyword. This can be done using a simple regular expression match. The pattern in this regular expression specifies that there must be an upper-case letter at the beginning of the string held in $Word:

if($Word =~ /^[A-Z]/) { ... }

The =~ operator is Perl’s regular expression matching operator; it is used to invoke the comparison of the value of $Word and the /^[A-Z]/ pattern. If the current word is classified as a keyword, then the words before it are combined to form the start string, and the keyword and remaining words are combined to form an end string. These strings can then be combined to produce a line for the final output.

This is achieved using the sprintf function (the same as that in C’s stdio library). The sprintf function creates a string in memory, returning this string as its result. Like printf, sprintf takes a format string and a list of arguments. The output lines shown can be produced using the statement:

$line = sprintf "%50s %-50s\n", $start, $end;
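The keyword test and the formatting step can be tried in isolation. The following is only a minimal sketch: the helper names is_keyword and format_entry are hypothetical and do not appear in the original program, which performs both steps inline.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the word starts with an upper-case letter.
sub is_keyword {
    my ($word) = @_;
    return $word =~ /^[A-Z]/ ? 1 : 0;
}

# Hypothetical helper: right-justify the start string in 50 columns,
# add one space, then left-justify the end string in 50 columns.
sub format_entry {
    my ($start, $end) = @_;
    return sprintf "%50s %-50s\n", $start, $end;
}

foreach my $word ("the", "Perl", "language") {
    print "$word: ", (is_keyword($word) ? "keyword" : "not a keyword"), "\n";
}
print format_entry("the ", "Perl language ");
```

Note that %50s pads short strings but does not truncate long ones, so a start string longer than 50 characters would push the keyword to the right.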

The complete program is:

#!/usr/bin/perl

# Sort helper: compare two report lines by the text from position 50 onwards.
sub by_keystr {
   my $str1 = substr($a,50);
   my $str2 = substr($b,50);
   if($str1 lt $str2) { return -1; }
   elsif($str1 eq $str2) { return 0; }
   else { return 1; }
}

@collection = ();
while($title = <STDIN>) {
   chomp($title);
   @Title = split / / , $title;
   $start = "";
   foreach $i (0 .. $#Title) {
      $Word = $Title[$i];
      if($Word =~ /^[A-Z]/) {
         # Keyword found: collect it and the words that follow.
         $end = "";
         for($j=$i;$j<=$#Title;$j++)
            { $end .= $Title[$j] . " "; }
         $line = sprintf "%50s %-50s\n", $start, $end;
         push(@collection, $line);
      }
      $start .= $Word . " ";
   }
}
@sortcollection = sort by_keystr @collection;
foreach $entry (@sortcollection) {
   print $entry;
}
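As noted earlier, the lt and eq tests in the sort helper could be replaced by a single cmp. A possible version is sketched below; the name by_keystr_cmp and the two sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Same ordering as by_keystr: compare the text from position 50 onwards.
# cmp returns -1, 0 or 1, which is what sort expects from its helper.
sub by_keystr_cmp {
    return substr($a, 50) cmp substr($b, 50);
}

# Two sample report lines in the same layout as the program's output.
my @collection = (
    sprintf("%50s %-50s\n", "Learning ", "Perl quickly "),
    sprintf("%50s %-50s\n", "", "Apache modules "),
);

my @sortcollection = sort by_keystr_cmp @collection;
print @sortcollection;   # the "Apache" line is printed before the "Perl" line
```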

In Perl, there is always another way! Another way of building the $end string would use Perl’s join function applied to a slice of the @Title array:

$end = join ' ', @Title[$i .. $#Title];
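The two approaches can be compared side by side. This is a small sketch with hypothetical helper names (end_by_loop, end_by_join) and a made-up title; the only difference in the result is that the loop leaves a trailing space and join does not.

```perl
use strict;
use warnings;

# Loop version, as in the complete program: appends each word plus a space.
sub end_by_loop {
    my ($i, @Title) = @_;
    my $end = "";
    for (my $j = $i; $j <= $#Title; $j++) {
        $end .= $Title[$j] . " ";
    }
    return $end;
}

# Slice-and-join version: @Title[$i .. $#Title] is the sub-list of words
# from index $i to the last index; join puts one space between them.
sub end_by_join {
    my ($i, @Title) = @_;
    return join ' ', @Title[$i .. $#Title];
}

my @Title = split / /, "Learning Perl the hard way";
print "[", end_by_loop(1, @Title), "]\n";   # [Perl the hard way ]
print "[", end_by_join(1, @Title), "]\n";   # [Perl the hard way]
```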

Perl’s join function (documented in perlfunc) has two arguments: an expression and a list. It builds a string by joining the separate strings of the list, using the value of the expression as the separator between them.

Perl comes with libraries of several thousand subroutines; often the majority of your work can be done using existing routines. However, you will need to define your own subroutines, if only to tidy up your code and avoid excessively large main-line programs. Perl routines are defined as:

sub name block

A routine has a return value; this is either the value of the last statement executed or a value specified in an explicit return statement. Arguments passed to a routine are combined into a single list, @_. Individual arguments may be isolated by indexing into this list, or by using a list literal as an lvalue. As illustrated with the sort helper function in the last part, subroutines can define their own local scope variables. Many more details of subroutines are given in the perlsub part of the documentation. Parentheses are optional in calls to a subroutine that has already been declared:

Process_data($arg1, $arg2, $arg3);

is the same as

Process_data $arg1, $arg2, $arg3;
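The points about @_ and return values can be illustrated with a short sketch. The routines area_indexed and area_listed are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Arguments isolated by indexing into @_.
sub area_indexed {
    my $width  = $_[0];
    my $height = $_[1];
    return $width * $height;   # explicit return statement
}

# Arguments isolated with a list literal used as an lvalue.
sub area_listed {
    my ($width, $height) = @_;
    $width * $height;          # value of the last statement is returned
}

my $with_parens    = area_indexed(3, 4);
my $without_parens = area_listed 3, 4;   # allowed: the sub is defined above
print "$with_parens $without_parens\n";  # 12 12
```

The call without parentheses works here because area_listed is defined earlier in the file; a call that appears before the definition needs the parentheses.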

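As an example, a routine that converts a Unix-style permission string such as "-rwxr-x---" into the corresponding numeric access code can be defined as: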

sub octal {
   my $str = $_[0];
   my $code = 0;
   for(my $i=1;$i<10;$i++) {
      $code *=2;
      $code++ if("-" ne substr($str,$i,1));
   }
   return $code;
}

This subroutine could be invoked:

$str = "-rwxr-x---";
$accesscode = octal $str;
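To see what the routine returns, the following sketch reproduces it (so that the example is self-contained) and prints the result in both decimal and octal. The loop skips the first character and treats each of the next nine as one bit, so the value is an ordinary number; the printf formats are an addition for illustration.

```perl
use strict;
use warnings;

# The routine from the text, reproduced unchanged apart from layout.
sub octal {
    my $str = $_[0];
    my $code = 0;
    for (my $i = 1; $i < 10; $i++) {
        $code *= 2;                               # shift left by one bit
        $code++ if "-" ne substr($str, $i, 1);    # set the bit if permitted
    }
    return $code;
}

my $accesscode = octal("-rwxr-x---");
# rwx r-x --- is binary 111 101 000
printf "%d decimal, %o octal\n", $accesscode, $accesscode;   # 488 decimal, 750 octal
```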

For a second example, consider a subroutine to determine whether a particular string is present in a list:

member(item,list);

As noted above, the arguments for a routine are combined into a single list; they have to be split apart in the routine. The processing involves a foreach loop that checks whether the next list member equals the desired string:

sub member {
   my($entry,@list) = @_; # separate the arguments
   foreach my $memb (@list) {
      if($memb eq $entry) { return 1; }
   }
   return 0;
}

Actually, there is another way. There is no need to invent a member subroutine because Perl already possesses a generalized version in its grep routine:

grep match_criterion datalist

When used in a list context, grep produces a sub-list of those members of datalist that satisfy the test (the returned elements are aliases for the original members, not copies). When used in a scalar context, grep returns the number of members of datalist that satisfy the test.
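Both contexts can be seen in a short sketch. The helper count_matches and the sample data are hypothetical; the block form of grep used here, grep { test } list, sets $_ to each element in turn.

```perl
use strict;
use warnings;

# Hypothetical replacement for member: counts how many list elements
# equal $entry. A count of zero is false, any other count is true.
sub count_matches {
    my ($entry, @list) = @_;
    return scalar grep { $_ eq $entry } @list;   # scalar context: a count
}

my @names = ("anne", "bob", "carol", "bob");

# List context: the matching elements themselves.
my @long = grep { length($_) > 3 } @names;

print "bob appears ", count_matches("bob", @names), " time(s)\n";   # 2
print "long names: @long\n";                                        # anne carol
print "zed is absent\n" unless count_matches("zed", @names);
```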
