Perl Programming
 
  #1  
August 18th, 2012, 07:23 AM
slurch901
Special character search

Hi all,

I was wondering if anyone could help me search for special characters in a document, e.g. /, *, ^ ...
So far I can only search for letters or words.

The code takes its input from the terminal, e.g. perl search <file1> <file2> ....
It returns the number of times the string occurs, reports the line number of each match, and prints that line.

Here is my code so far:

Code:
#!/usr/bin/perl
print "Please enter search string:";
chomp($input=<STDIN>); 
while ($n <= $#ARGV) {
	$file = @ARGV[$n];
	open(txt, $file);
	print "\n$file contains:\n";
	while($line = <txt>) {
		$linenum++;
		if ($line =~ (/$input/i)) {
		print "Line:$linenum, $line";
			while ($line =~ (/$input/g)) {
				$found++;
			}
		}
	}
print "\nIt was found $found times.\n";
$linenum = 0;
$found = 0;
$n++;
}
close(txt);



Thanks a lot.

  #2  
August 18th, 2012, 08:10 AM
keath
You need to escape the $input variable in the regex so that any special characters are matched literally instead of being interpreted as regex metacharacters. You do that with \Q and \E.

Code:
#!/usr/bin/perl
use strict;
use warnings;

print "Please enter search string: ";
my $input;
chomp($input = <STDIN>);

foreach my $file (@ARGV) {
	my $found = 0;	# start at zero so the total prints cleanly when nothing matches
	open my $fh, "<", $file or die "Unable to open $file: $!";

	print "\n$file contains:\n";
	while(<$fh>) {
		if (/\Q$input\E/i) {	# \Q...\E makes any metacharacters in $input literal
			print "Line $.: $_";
			$found++ while /\Q$input\E/gi;	# count every occurrence, ignoring case
		}
	}
	close $fh;
	print "\n'$input' found $found times.\n";
}


I've done a few other things here which could be helpful.

First, use strict and warnings at the top. Very important. Always use them. Strict mode might be confusing at first, but it mostly means that you have to declare your variables. You do that by using the 'my' keyword the first time a variable is used in that scope. After that, perl will catch a typo or a changed variable name later in the script.
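
To illustrate the point about typos, here is a minimal sketch. The helper strict_error is a made-up name for this example (it is not part of Perl); it compiles a small snippet under strict with a string eval and returns whatever error perl reports.

```perl
use strict;
use warnings;

# Hypothetical helper for this sketch: compile a snippet under strict
# and return the error text, or an empty string if it compiled cleanly.
sub strict_error {
    my ($code) = @_;
    my $ok = eval "use strict; $code; 1";
    return $ok ? '' : $@;
}

# Declared with 'my' and spelled consistently: no complaint.
my $clean = strict_error('my $found = 0; $found++');
print "clean snippet: ", ($clean eq '' ? "no error" : $clean), "\n";

# Misspelled on second use ($fuond): strict refuses to compile it.
print "typo snippet: ", strict_error('my $found = 0; $fuond++');
```

Without strict, the misspelled name would quietly become a new global variable and the count would simply stay at zero.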

Check for failure when you open files. A user can easily enter a bad filename, or fail to provide the complete path.

You don't have to create your own variables to keep track of the line number or the contents of the current line. $. is the line number of the filehandle you last read from. $_ is the line itself in this context, though it is perfectly fine to use your own variable name. If $line is clearer to you, use it.
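
A small sketch of $. used together with a named line variable. The helper name numbered_lines and the sample text are invented for this example; it reads from an in-memory filehandle so that no real file is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: return each line of $text prefixed with its line number.
sub numbered_lines {
    my ($text) = @_;
    # Open a filehandle on the string itself instead of on a file.
    open my $fh, '<', \$text or die "Unable to open in-memory handle: $!";
    my @out;
    while (my $line = <$fh>) {      # a named variable instead of $_
        chomp $line;
        push @out, "$.: $line";     # $. is the current line number of $fh
    }
    close $fh;
    return @out;
}

print "$_\n" for numbered_lines("first line\nsecond line\nthird line\n");
```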

An example of proper scope for variables: notice that my $found is declared inside the foreach loop. At the end of the loop I don't have to reset $found to zero. The variable goes out of scope when the end of the loop body is reached, and a new $found variable is created at the top of the loop on the next iteration. Same for $fh (the file handle).
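
Here is a stripped-down sketch of the same scoping idea without any files. The helper name counts_per_item and the sample words are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count the occurrences of $needle in each item separately.
sub counts_per_item {
    my ($needle, @items) = @_;
    my @counts;
    foreach my $item (@items) {
        my $found = 0;                              # a brand-new $found on every iteration
        $found++ while $item =~ /\Q$needle\E/g;     # count every occurrence in this item
        push @counts, $found;
    }                                               # $found goes out of scope here
    return @counts;
}

print join(", ", counts_per_item("a", "banana", "kiwi", "papaya")), "\n";   # prints "3, 0, 3"
```

Because each iteration gets its own $found, the count for one item can never leak into the next one.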

Last edited by keath : August 18th, 2012 at 08:12 AM.

  #3  
August 18th, 2012, 08:30 AM
Laurent_R
Hi,

the following characters: "+ ? . * ^ $ ( ) [ ] { } | \" have a special meaning in regular expressions and therefore need to be escaped (i.e. preceded by the escape character, "\") if you need to use their literal value. For example, if you are looking for the + character, your search should be for the string "\+". To look for the escape character ("\"), search for the string "\\". Etc.

Either your user will have to enter this escape character, or you can build a function that will rework the user's input to add this escape character before any character belonging to the list above.
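
Such a function could look roughly like the sketch below. The name escape_regex is made up for this example; the character class is simply the list given above. Perl also has a built-in, quotemeta, which goes further and escapes every non-word character.

```perl
use strict;
use warnings;

# Hand-rolled version: put a backslash in front of each metacharacter
# from the list above.
sub escape_regex {
    my ($string) = @_;
    $string =~ s/([+?.*^\$()\[\]{}|\\])/\\$1/g;
    return $string;
}

my $input   = '2*(3+4)';
my $escaped = escape_regex($input);     # 2\*\(3\+4\)

# The escaped string can now be interpolated safely into a pattern.
print "found it\n" if 'x = 2*(3+4);' =~ /$escaped/;

# The built-in gives the same result for this input.
print "quotemeta: ", quotemeta($input), "\n";
```

Writing /\Q$input\E/ in the pattern has the same effect as calling quotemeta on $input first.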

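A couple of comments about your code. The $linenum variable is unnecessary: the built-in $. special variable always contains the current line number of the file being read. The innermost while loop also seems unnecessary to me if all you need is one count per matching line (it only matters when the string can occur several times on the same line). Unless I am missing something, you could have just: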

Perl Code:
    while($line = <txt>) {
        if ($line =~ (/$input/i)) {
            print "Line: $., $line";
            $found++;
        }
    }


The other thing is that the part:

Perl Code:
while ($n <= $#ARGV) {
    $file = @ARGV[$n];
    open(txt, $file);


is not optimal. First, you should always check the return status of an "open" statement. Second, it would be better to use each of the values of @ARGV directly with a foreach statement, rather than using these somewhat clumsy $n and $#ARGV variables:

Perl Code:
foreach my $file (@ARGV) {
    open my $FILE_IN, "<", $file or die "could not open $file: $!\n";
    # ... read from $FILE_IN here ...
}


EDIT: just when I was about to post this message, I was interrupted by a long phone call from someone at a charity asking for a donation. I had not seen Keath's answer when I wrote mine. And BTW, Keath, I did not know about these \Q and \E tags. It must be something new. I'll look it up, as it seems fairly handy.

Last edited by Laurent_R : August 18th, 2012 at 08:38 AM.

  #4  
August 18th, 2012, 08:40 AM
FishMonger
Hmm,

Which of you three is going to get the grade for doing the homework assignment?

:/

  #5  
August 18th, 2012, 09:10 AM
slurch901
I have modified it now and it's working thanks to your help. And FishMonger, what are you talking about? This is just for practise, I'm not doing any sort of Perl course.

  #6  
August 18th, 2012, 09:11 AM
keath
First time poster. Provides working code. Doesn't know how to escape regex.

I don't see any abuse of the forum. Seems totally fair, and I'm happy to make the minor effort.

  #7  
August 18th, 2012, 09:17 AM
Laurent_R
Quote:
Originally Posted by FishMonger
Hmm,

Which of you three is going to get the grade for doing the homework assignment?

:/


This may be homework (or maybe not), but slurch901 has done reasonable work to produce something that more or less works, so why not help her or him with these special characters? And why not give a few pieces of advice to improve the code?

@Keath: I have looked up the \Q and \E quote-meta escapes, which I did not know. Thank you, this will be useful to me.

  #8  
August 18th, 2012, 09:38 AM
FishMonger
I might be getting just a little cynical in my old age.

  #9  
August 18th, 2012, 01:01 PM
Laurent_R
BTW, as an additional comment on your code, the index function would probably be better than a regex for what you are trying to do (looking for an exact match, not for a pattern): on the one hand, it is faster, and, on the other hand, it will not fail on special characters, because it treats the search string as plain text. Note that index is case-sensitive, unlike your /i match.
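
A rough sketch of what counting with index could look like. The helper name count_occurrences is invented for this example, and it counts non-overlapping matches only.

```perl
use strict;
use warnings;

# Hypothetical helper: count non-overlapping occurrences of $needle in $line
# with index(), so no regex (and no escaping) is involved.
sub count_occurrences {
    my ($line, $needle) = @_;
    return 0 if $needle eq '';              # avoid looping forever on an empty search string
    my ($count, $pos) = (0, 0);
    while ((my $hit = index($line, $needle, $pos)) != -1) {
        $count++;
        $pos = $hit + length($needle);      # continue searching after this match
    }
    return $count;
}

print count_occurrences('a*b + c*d ^ e', '*'), "\n";    # prints 2
```

For a case-insensitive search, one option is to pass lc($line) and lc($needle) to the helper.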
