Remove trailing whitespace recursively only at end of file using grep/sed?

    awk '{if (flag) print line; line = $0; flag = 1} END {gsub("[[:space:]]+$","",line); printf "%s", line}'

Edit: New version: The sed command removes all the trailing lines that consist of only whitespace, then the awk command removes the ending newline.

    sed '/^[[:space:]]*$/{:a;$d;N;/\n[[:space:]]*$/ba}' inputfile | awk '{if (flag) print line; line = $0; flag = 1} END {printf "%s", line}'

The disadvantage is that it reads the file twice.

Edit 2: Here's an all-awk solution that only reads the file once. It accumulates white-space-only lines in a manner similar to the sed command above.

    #!/usr/bin/awk -f

    # accumulate a run of white-space-only lines so they can be printed or discarded
    /^[[:space:]]*$/ {
        accumlines = accumlines nl $0
        nl = "\n"
        accum = 1
        next
    }

    # print the previous line and any accumulated lines, store the current line for the next pass
    {
        if (flag) print line
        if (accum) {
            print accumlines
            accum = 0
        }
        accumlines = nl = ""
        line = $0
        flag = 1
    }

    # print the last line without a trailing newline after removing all trailing whitespace
    # the resulting output could be null (nothing rather than 0x00)
    # note that we're not printing the accumulated lines since they're part of the
    # trailing white-space we're trying to get rid of
    END {
        gsub("[[:space:]]+$","",line)
        printf "%s", line
    }

Edit 3: removed the unnecessary BEGIN clause, changed lines to accumlines so it's easier to distinguish from line (singular), and added comments.
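For readers less comfortable with awk, the same single-pass idea can be sketched in Python. This is only an illustration under stated assumptions, not the answerer's code: the function name strip_end_by_lines is made up here, the input is assumed to be an iterable of lines with their newlines already removed, and str.strip() treats any Unicode whitespace as blank, which may differ from awk's locale-dependent [[:space:]].

```python
def strip_end_by_lines(lines):
    """Join lines with newlines, dropping all whitespace at the very end.

    `lines` is an iterable of strings without their trailing newlines.
    (Hypothetical helper name; it mirrors the hold-back logic described above.)
    """
    out = []      # pieces that are known to be safe to emit
    blanks = []   # run of whitespace-only lines that may turn out to be trailing
    prev = None   # most recent non-blank line, held back without its newline

    for line in lines:
        if line.strip() == "":
            blanks.append(line)          # might be trailing, so hold it back
            continue
        if prev is not None:
            out.append(prev + "\n")      # prev was not the last line after all
        out.extend(b + "\n" for b in blanks)
        blanks.clear()
        prev = line

    if prev is not None:
        out.append(prev.rstrip())        # last real line: no newline, no trailing blanks
    return "".join(out)
```

Whitespace-only lines in the middle of the input are kept because they are flushed as soon as a later non-blank line shows up; only the final run is dropped.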

– j_random_hacker Jan 18 '11 at 18:09

Not very familiar with awk but I think this handles the multiple-trailing-blank-lines case incorrectly. -1 for now. – j_random_hacker Jan 18 '11 at 18:34

@j_random_hacker: Only the last line. See my edit. – Dennis Williamson Jan 18 '11 at 18:50

OK that last solution is correct I believe, -1 reverted. Can't quite bring myself to +1 it though... – j_random_hacker Jan 18 '11 at 20:28

This will strip all trailing whitespace:

    perl -e '$s = ""; while (defined($_ = getc)) { if (/\s/) { $s .= $_; } else { print $s, $_; $s = ""; } }' < infile > outfile

There's probably an equivalent in sed but I'm much more familiar with Perl; hope that works for you. Basic idea: if the next character is whitespace, save it; otherwise, print any saved characters followed by the character just read. If we hit EOF after reading one or more whitespace characters, they won't be printed.

This will simply detect trailing whitespace, giving an exit code of 1 if so:

    perl -e 'while (defined($_ = getc)) { $last = $_; } exit($last =~ /\s/);' < infile

EDIT: The above describes how to detect or change a single file. If you have a large directory tree containing files that you want to apply the changes to, you can put the command in a separate script, fix.pl:

    #!/usr/bin/perl
    $s = "";
    while (defined($_ = getc)) {
        if (/\s/) { $s .= $_; } else { print $s, $_; $s = ""; }
    }

and use it in conjunction with the find command:

    find /top/dir -type f -exec sh -c 'mv "{}" "{}.bak" && fix.pl < "{}.bak" > "{}"' ';'

This will move each original file to a backup file ending in ".bak". (It would be a good idea to test this on a small test fileset first.)
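The "save whitespace until a non-whitespace character arrives" idea described in this answer can also be sketched in Python. This is a rough equivalent rather than a translation of the Perl: the function name strip_trailing_whitespace is invented for the example, it works on text-mode file objects, and str.isspace() accepts more characters than Perl's \s does by default.

```python
def strip_trailing_whitespace(src, dst):
    """Copy text from src to dst, dropping a whitespace run that reaches EOF.

    src and dst are text-mode file-like objects (hypothetical helper;
    for example sys.stdin and sys.stdout).
    """
    pending = []                  # whitespace seen since the last non-whitespace char
    while True:
        ch = src.read(1)
        if ch == "":              # EOF: whatever is pending is trailing, so discard it
            break
        if ch.isspace():
            pending.append(ch)    # not yet known whether this is trailing
        else:
            dst.write("".join(pending))
            dst.write(ch)
            pending.clear()
```

Memory use is bounded by the longest whitespace run rather than by line length, which is the property the character-at-a-time approach is after. Calling strip_trailing_whitespace(sys.stdin, sys.stdout) would make it usable as a filter.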

By the way, I don't think there's any way sed can do it. – Dennis Williamson Jan 18 '11 at 18:05

@Dennis: Yes. All reads will be buffered by the OS so it won't be terribly slow. (Although it seems likely that the file consists of short text lines in this case, reading a line at a time risks bad performance and high memory usage on files with very long lines or binary files that may contain few \n characters.) – j_random_hacker Jan 18 '11 at 18:08

There probably isn't an equivalent in sed; it deals in lines and therefore emits a newline. – Jonathan Leffler Jan 18 '11 at 18:08

Uh oh, I fear I didn't explain my problem accurately enough... those 1,500 files that I have are in directories and subdirectories; can your code be used to start in a given directory and scan each subdir? – MALON Jan 18 '11 at 18:10

@MALON: Use find for that. man find – j_random_hacker Jan 18 '11 at 18:12

A Perl solution:

    # command-line arguments are the names of the files to check.
    # output is names of files that end with trailing whitespace
    for (@ARGV) { open F, ' =~ /\s/ }
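As a rough illustration of the check described in those comments (take file names, report the ones that end in whitespace), here is a Python sketch. It does not reconstruct the Perl one-liner; the function names are made up for the example, and "whitespace" here means ASCII whitespace as defined by bytes.isspace().

```python
import os


def ends_with_whitespace(path):
    """Return True if the last byte of the file is ASCII whitespace."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:          # empty file: nothing to check
            return False
        f.seek(-1, os.SEEK_END)    # jump straight to the final byte
        return f.read(1).isspace()


def files_ending_in_whitespace(paths):
    """Filter paths down to those whose final byte is whitespace."""
    return [p for p in paths if ends_with_whitespace(p)]
```

Only one byte per file is read, so the cost does not depend on file size. Note that an ordinary text file ending in a single newline is reported too, since a newline counts as whitespace.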

So what about the rest of the lines, if I interpret your code correctly. – ghostdog74 Jan 19 '11 at 2:27

Oh, I interpreted the question as only worrying about the end of the file, not the end of each line. – mob Jan 19 '11 at 2:30

@ghostdog74: mob's interpretation is correct -- check the OP's question. – j_random_hacker Jan 26 '11 at 4:54

    ruby -e 's=ARGF.read;s.rstrip!;print s' file

Basically, read the whole file, strip the trailing whitespace if any, and print out the contents. So this solution is not for VERY huge files.

– j_random_hacker Jan 19 '11 at 2:18

Yes. I probably should have mentioned it's not for huge files. – ghostdog74 Jan 19 '11 at 2:24

You may also use man ed to delete trailing white space at file end and man dd to delete a final newline (although keep in mind that ed reads the whole file into memory and performs an in-place edit without any kind of previous backup):

    # tested on Mac OS X using Bash
    while IFS= read -r -d $'\0' file; do
       # remove white space at end of (non-empty) file
       # note: ed will append final newline if missing
       printf '%s\n' H '$g/[[:space:]]\{1,\}$/s///g' wq | ed -s "${file}"
       printf "" | dd of="${file}" seek=$(($(stat -f "%z" "${file}") - 1)) bs=1 count=1
       #printf "" | dd of="${file}" seek=$(($(wc -c.
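The in-place idea (shorten the file rather than rewrite it) can be sketched in Python by scanning backwards from the end and truncating. This is an alternative technique, not a port of the ed/dd commands: the function names and the chunk_size parameter are invented for the example, only ASCII whitespace is considered, no backup is made, and every regular file under the directory is touched, including binary ones, so it would need filtering before real use.

```python
import os


def truncate_trailing_whitespace(path, chunk_size=4096):
    """Truncate the file in place so it no longer ends in ASCII whitespace.

    Returns the number of bytes removed. (Hypothetical helper, no backup.)
    """
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        size = end = f.tell()
        while end > 0:
            start = max(0, end - chunk_size)
            f.seek(start)
            chunk = f.read(end - start)
            kept = chunk.rstrip()        # bytes.rstrip() strips ASCII whitespace
            if kept:
                end = start + len(kept)  # found the last non-whitespace byte
                break
            end = start                  # whole chunk was whitespace, keep scanning
        f.truncate(end)
    return size - end


def truncate_tree(top):
    """Apply truncate_trailing_whitespace to every file below top."""
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            truncate_trailing_whitespace(os.path.join(dirpath, name))
```

Only the tail of each file is read, and os.walk covers the "directories and subdirectories" part of the question. Trying it on a copy of a small directory first seems prudent.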

Might be easier reading the file from the bottom to the top:

    tac filename |
    awk '
        /^[[:space:]]*$/ && !seen {next}
        /[^[:space:]]/   && !seen {gsub(/[[:space:]]+$/,""); seen=1}
        seen
    ' | tac

Creative :) But the line #4 starts /[^[: when it should be /^[[:, and I believe the 1st argument to gsub should be /^[[:space:]]+/. Can't +1 as this will be pretty inefficient -- the 2nd tac takes its input from a pipe and so has to wait for the entire result of awk to be saved before it can start. – j_random_hacker Jan 18 '11 at 20:40

@j_random_hacker: No, the caret indicates "not" in that line. In the line before it, it indicates "beginning of line". The gsub is looking at whitespace at the end of the line rather than lines that consist only of whitespace. There is a problem, however. The script uses AWK's default output, print, so a newline is appended. Also, the file contents are processed three times. – Dennis Williamson Jan 18 '11 at 20:57

@Dennis: I see now that /[^[: is (surprisingly enough!) the right thing, thanks, but I think you're wrong about the gsub part -- remember, we're still dealing with reversed input at this point, so we should be stripping whitespace from the beginning of the line. – j_random_hacker Jan 18 '11 at 21:05

@j_random_hacker: tac reverses top to bottom, rev reverses left to right. So $ is correct. – Dennis Williamson Jan 18 '11 at 21:23

@Dennis: Ah I see. Thanks! – j_random_hacker Jan 18 '117 at 2:16

Just for fun, here's a plain C answer:

    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>

    int main(int argc, char **argv) {
        int c, bufsize = 100, ns = 0;
        char *buf = malloc(bufsize);

        while ((c = getchar()) != EOF) {
            if (isspace(c)) {
                /* hold whitespace back until we know it is not trailing */
                if (ns == bufsize) buf = realloc(buf, bufsize *= 2);
                buf[ns++] = c;
            } else {
                /* flush the held whitespace, then the character itself */
                fwrite(buf, 1, ns, stdout);
                ns = 0;
                putchar(c);
            }
        }

        free(buf);
        return 0;
    }

Not much longer than Dennis's awk solution, and, dare I say it, easier to understand! :-P

Using man dd without man ed:

    while IFS= read -r -d $'\0' file; do filesize="$(wc -c.

Version 2. Linux syntax. Proper command.

    find /directory/you/want -type f | \
       xargs --verbose -L 1 sed -n --in-place -r \
       ':loop;/[^[:space:]\t]/ {p;b;}; N;b loop;'

Version 1. Remove whitespace at the end of each line. FreeBSD syntax.

    find /directory/that/holds/your/files -type f | xargs -L 1 sed -i '' -E 's/[: :]+$//'

where the white space in [: :] actually consists of one space and one tab character. The space is easy: you just hit the space bar. To insert a tab character, press Ctrl-V and then Tab in the shell.

This only trims whitespace from the end of each line -- the asker wants to trim all whitespace characters from the end of the file (which may span 1 or more lines). – j_random_hacker Jan 18 '11 at 20:21

Also sed takes a -e option, not -E. – j_random_hacker Jan 18 '11 at 20:30

@j_random_hacker: OS X (and BSD) sed accepts -E for extended regular expressions (GNU sed uses -r). – Dennis Williamson Jan 18 '11 at 20:37

Indeed. I didn't notice about the end of the file. I'll try to rewrite tomorrow. – akond Jan 18 '11 at 22:57

So I fixed it. Should work now. – akond Jan 18 '117 at 8:59
