UNIX Help
 
  #1  
May 3rd, 2005, 07:14 AM
samb1
AWK script problem

Guys, I'm using a pretty common script to count the frequency of the words in a text file. I want to do something similar, but with the output of an online tagger. The tagger identifies the nouns and places an NN tag next to each one. So basically I want to count all occurrences of words that have NN following them.
My code at present is:
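{
    gsub(/[.,:;!?(){}]/, "")      # strip punctuation
    for (i = 1; i <= NF; i++)
        freq[$i]++                # count each word
}

END {
    # the command name must be a quoted string, otherwise awk
    # treats sort as an (empty) variable
    for (word in freq)
        printf "%s\t%d\n", word, freq[word] | "sort"
}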

Can someone give me some advice as to what I need to amend?

  #2  
May 3rd, 2005, 08:32 AM
vgersh99
What's your sample input and desired output?

  #3  
May 3rd, 2005, 08:46 AM
samb1
An example of the text after it has been tagged is:

([ Mr._NNP Gray_NNP ]) "The word MR has a _NNP tag after it, so that is a noun"
<: said_VBD :>
([ it_PRP ])
<: would_MD begin_VB promptly_RB :>
at_IN three_CD ._. "_``

I need to count all instances of words that have the _NNP or _NN tag after them.
My output would just be the total of _NNP and _NN instances.

Thanks

  #4  
May 3rd, 2005, 03:43 PM
vgersh99
I assume that for the above sample the result should be 2.

nawk -f sam.awk myFile

Here's sam.awk:
Code:
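BEGIN {
  # any non-space character followed by a _NNP or _NN tag
  RE="[^ ](_NNP|_NN)"
}
{
  # gsub returns the number of substitutions it made on this line
  tot+=gsub(RE, "")
}

END {
  printf("Total [_NNP|_NN]: %d\n", tot)
}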
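
One thing to watch with the regex above: [^ ](_NNP|_NN) also matches the start of longer tags such as _NNS, so a token like dogs_NNS would be counted too. The sample doesn't show such tags, so whether the tagger emits them is an assumption. If only exact _NNP and _NN tags should count, a minimal sketch is to test each whitespace-separated field and require the tag to end the field. The helper name count_nouns and the extra input tokens (dogs_NNS, cat_NN) are made up for illustration, and plain awk is used here in place of nawk.

```shell
# count_nouns: hypothetical helper; reads the named files, or stdin if none given
count_nouns() {
  awk '
    {
      # assumes every token is whitespace-separated, as in the sample
      for (i = 1; i <= NF; i++) {
        if ($i ~ /._NNP$/)       # field ends in exactly _NNP
          nnp++
        else if ($i ~ /._NN$/)   # field ends in exactly _NN (not _NNS)
          nn++
      }
    }
    END { printf("NNP: %d, NN: %d, total: %d\n", nnp, nn, nnp + nn) }
  ' "$@"
}

# illustrative input: first line is from the sample, second is invented
printf '%s\n' '([ Mr._NNP Gray_NNP ])' 'dogs_NNS cat_NN' | count_nouns
# prints: NNP: 2, NN: 1, total: 3
```

Run against a file it would be used as count_nouns myFile. Keeping the two counters separate costs nothing and makes it easy to check the total by hand.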
