UNIX Help
 
  #1  
September 21st, 2006, 03:47 PM
jabs
Sed search and replace

Hi all,

I want to write a sed or awk routine that will find the instance where a line feed and double quote are together in a line and replace them with just the " double quote. I can replace all line feeds, for one gigantic line of data, but that's no good either...I'm not having much success.

Here is my brilliant, errrr, not so successful code to date.

sed -e 's/"\012/"/g' File1 > File2


TIA for any help.

joe.

  #2  
September 27th, 2006, 07:57 AM
stanleypane
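You'll want to use awk for this purpose. Sed works on one line at a time with the trailing newline stripped, so a pattern containing a newline won't match without extra work (such as the N command). Try this: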

Code:
awk '/\"$/{printf "%s",$0; next}{print}' file1 > file2
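To illustrate why the original sed attempt never matched, here is a minimal Python sketch of line-at-a-time substitution. The function name and the two-record sample are hypothetical, made up for illustration in the thread's pipe-delimited style. It also suggests that the pattern as written (quote followed by newline) would remove the wrong newline even if it could see one.

```python
import re

def substitute_per_line(text, pattern, replacement):
    """Apply a regex substitution to each line on its own.

    As in a line-oriented tool, the newline is removed before matching
    and put back afterwards, so the pattern never sees it.
    """
    lines = text.split("\n")
    return "\n".join(re.sub(pattern, replacement, line) for line in lines)

# Hypothetical sample: one good record, one record with a stray quote line.
sample = '1|"good"\n2|"bad\n"\n'

# Per-line matching: a quote followed by a newline is never found.
per_line = substitute_per_line(sample, r'"\n', '"')
print(per_line == sample)

# Whole-text matching of the same pattern joins the good record
# onto the next one instead of fixing the stray quote.
whole = re.sub(r'"\n', '"', sample)
print(repr(whole))
```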

  #3  
September 28th, 2006, 12:29 PM
jabs
Thanks for the help. Unfortunately, it made the file one huge line.

Here is an example of the data I'm using:

1234|320|1|"Sample of data"
1234|321|2|"for DevShed"
these are good.

Here are the bad lines
9876|1000|1|"This sample
"
9876|1001|2|"show the bad rec
"
56574|1015|1|"Another bad one
"

For the good data, obviously, I'd like it to remain the same. For the bad, I'd like to move the " to the end of the preceding line (or add a " to the end of line 1 and delete line 2).

Thanks again for looking at this!

Joe.
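The request above (move the stray " to the end of the preceding line) can be sketched in Python by working on the whole text rather than line by line. This is only one possible approach; the function name is hypothetical, and it assumes a stray quote always sits alone on its own line, as in the sample data.

```python
import re

def move_orphan_quotes(text):
    """Join a line holding only a double quote onto the line before it."""
    # With re.MULTILINE, $ matches at the end of every line, so this
    # finds a newline followed by a quote that is alone on its line.
    return re.sub(r'\n"$', '"', text, flags=re.MULTILINE)

# The sample records from the post above.
sample = "\n".join([
    '1234|320|1|"Sample of data"',
    '1234|321|2|"for DevShed"',
    '9876|1000|1|"This sample',
    '"',
    '9876|1001|2|"show the bad rec',
    '"',
    '56574|1015|1|"Another bad one',
    '"',
]) + "\n"

fixed = move_orphan_quotes(sample)
print(fixed, end="")
```

Good records pass through unchanged because their closing quote is not alone on a line.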

  #4  
September 28th, 2006, 02:10 PM
stanleypane
Ahhhh... I understand your problem a little better now. Sorry for the confusion.

If you know that all offending lines always begin with a double quote, then you can simply remove those lines via grep. Then, any lines that don't end with a double quote can have it added.

There's probably a billion ways to do this, but here are two:
Code:
Method 1 - grep & sed

   grep "^[^\"]" file1 | sed -e "s/\([^\"]\)$/\1\"/g" > file2

Method 2 - grep & awk

   grep "^[^\"]" file1 | awk '/[^\"]$/{printf "%s\"\n",$0; next}{print}' > file2

Hope this helps!

  #5  
September 28th, 2006, 02:13 PM
Ehlanna
Change it to %s\n in the printf to add a newline in.

  #6  
September 30th, 2006, 08:52 AM
jabs
Yes - The grep/sed got me most of the way there. I just did a couple of manual edits and it worked fine. Thanks again for your help.

  #7  
September 30th, 2006, 08:55 AM
jabs
Quote:
Originally Posted by Ehlanna
Change it to %s\n in the printf to add a newline in.


Thanks for your help! (I like that avatar too!)

  #8  
October 2nd, 2006, 10:33 AM
stanleypane
Quote:
Originally Posted by Ehlanna
Change it to %s\n in the printf to add a newline in.


I've got the \n in there. It just has a \" just before it. He was wanting to add a quote before the newline.

  #9  
October 3rd, 2006, 08:03 AM
ghostdog74
Here's a Python alternative, without regular expressions
Input:
1234|320|1|"Sample of data"
1234|321|2|"for DevShed"
9876|1000|1|"This sample
"
9876|1001|2|"show the bad rec
"
56574|1015|1|"Another bad one
"


Code:
>>> for line in open("input.txt"):
...     line = line.strip()  # strip the newline and surrounding whitespace
...     if line != '"':  # skip lines holding only the stray quote
...         if not line.endswith('"'):
...             print(line + '"')  # add the missing closing quote
...         else:
...             print(line)
... 
1234|320|1|"Sample of data"
1234|321|2|"for DevShed"
9876|1000|1|"This sample"
9876|1001|2|"show the bad rec"
56574|1015|1|"Another bad one"

  #10  
October 4th, 2006, 08:30 AM
jabs
I've never used python, but I'll give this a try. Cheers!

  #11  
October 6th, 2006, 07:24 AM
Verletto
Quote:
Originally Posted by stanleypane
Code:
Method 1 - grep & sed

   grep "^[^\"]" file1 | sed -e "s/\([^\"]\)$/\1\"/g" > file2



Some of the chars can be omitted; here is a trimmed version:
Code:
grep -v '^"' file | sed 's/\([^"]$\)/\1"/' > file2


sed works on one line at a time, and the pattern is anchored to the end of the line with $, so it can match at most once per line. Therefore the 'g' option can be omitted.
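A quick way to check that the g flag is redundant: a pattern anchored with $ can match at most once per line. The sketch below uses Python's re module, whose subn replaces every match by default (much like sed's g) and reports how many replacements were made. The function name is hypothetical, and lines are assumed to arrive without their trailing newline.

```python
import re

# Same idea as the sed expression: one non-quote character at end of line.
pattern = re.compile(r'([^"])$')

def close_quote(line):
    """Append a double quote to a line that does not already end with one.

    Returns (new_line, number_of_replacements).
    """
    return pattern.subn(r'\1"', line)

print(close_quote('9876|1000|1|"This sample'))
print(close_quote('1234|320|1|"Sample of data"'))
```

Even with replace-all semantics the count never exceeds one, which is why dropping g does not change the result.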

That was just my 50 cents
