
Hi all, and thanks in advance. This is my first post here, so please let me know if I should do this differently.

I have a large text file containing lines like the following:

    DATE  TIMESTAMP  T W M  T AL M C  A_B_C

At first I read this in using the fopen and fgetl commands, so that I get a string:

    Readout = DATE  TIMESTAMP  T W M  T AL M C  A_B_C

I want to transform this via e.g. textscan.

While I feel I know MATLAB, I am by no means an expert with this command and have trouble using it. I want to get:

    A = 'DATE' 'TIMESTAMP' 'T W M' 'T AL M C' 'A_B_C'

However, using the following code:

    A = textscan(Readout,'%s');
    A = A{1}';

I get:

    A = 'DATE' 'TIMESTAMP' 'T' 'W' 'M' 'T' 'AL' 'M' 'C' 'A_B_C'

As I asked in the title, is there a way to ignore the single spaces?

PS: At the end of writing this I came up with a not very elegant solution. I would still like to know if there is a nicer one, however:

    ReadBetter = [];
    for n = 1:length(Read)-1
        % Drop a space only when the next character is not a space
        if Read(n) == ' ' & Read(n+1) ~= ' '
        else
            ReadBetter = [ReadBetter Read(n)];
        end
    end
    ReadBetter = [ReadBetter Read(n+1)];
    Read
    ReadBetter

Output:

    Read = DATE  TIMESTAMP  T W M  T AL M C  A_B_C
    ReadBetter = DATE TIMESTAMP TWM TALMC A_B_C

Now I can use ReadBetter with textscan.

A simpler solution to parse your string would be to use the function REGEXP to find the indices where you have 2 or more whitespace characters in a row, use these indices to break your string up into a cell array of strings using the function MAT2CELL, then use the function STRTRIM to remove leading and trailing whitespace from each substring. For example:

    >> str = 'DATE  TIMESTAMP  T W M  T AL M C  A_B_C';
    >> cutPoints = regexp(str,'\s{2,}');
    >> cellArr = mat2cell(str,1,diff([0 cutPoints numel(str)]));
    >> cellArr = strtrim(cellArr)

    cellArr =
        'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'
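To make the intermediate steps of the REGEXP / MAT2CELL / STRTRIM approach easier to follow, here is a minimal sketch that keeps each step in its own variable. It assumes the sample line uses exactly two spaces between fields; the variable names segLengths, pieces and fields are only illustrative.

```matlab
% Assumed sample line: fields separated by two spaces, single spaces inside fields.
str = 'DATE  TIMESTAMP  T W M  T AL M C  A_B_C';

% Start index of every run of two or more whitespace characters.
cutPoints = regexp(str, '\s{2,}');

% Length of each piece: from the previous cut (or the start) up to the next cut,
% with the last piece running to the end of the string.
segLengths = diff([0 cutPoints numel(str)]);

% Split the 1-by-N char row into pieces of those lengths.
pieces = mat2cell(str, 1, segLengths);

% The pieces still carry the separator whitespace at their edges; strip it.
fields = strtrim(pieces);
disp(fields)
```

Because the cuts fall at the start of each separator run, every piece keeps some separator whitespace at its edges, which is why the final STRTRIM is needed.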

Newer versions of MATLAB have a 'split' option for regexp, similar to Perl's split. The pattern below is two literal spaces followed by +, so it only matches runs of two or more spaces:

    >> str = 'DATE  TIMESTAMP  T W M  T AL M C  A_B_C';
    >> out = regexp(str, '  +', 'split')

    out =
        'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'
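A slightly more defensive variant of the 'split' idea, as a sketch: it assumes the line may also have leading or trailing whitespace and a varying number of separator characters, neither of which is stated in the question. Trimming first avoids empty strings at the ends of the result.

```matlab
% Hypothetical line with leading/trailing whitespace and uneven separators.
str = '  DATE  TIMESTAMP   T W M  T AL M C  A_B_C  ';

% Trim first, then split on any run of two or more whitespace characters.
fields = regexp(strtrim(str), '\s{2,}', 'split');
disp(fields)
```

Using \s instead of a literal space also treats tabs as separator characters, which may or may not be what you want for your files.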

I think that you are making things too complicated. Just use:

    fid = fopen('pathandnameoffile');
    textscan(fid,'%s','Delimiter','\t');

The example above assumes that you have tabs as delimiters. Change it to something else if required.

Thanks, but the files I have to open are strange (at least to me). All the simple ways of opening them have failed for me, so I ended up reading them in line by line and then ran into the problem described above. Maybe they are somehow corrupt, or their huge size (80 MB each) gives MATLAB a headache.

– Birk, Jun 30 '11 at 7:59
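For the line-by-line route mentioned in the comment above, a minimal sketch could combine fgetl with the regexp 'split' option. The file written here is a hypothetical two-line sample so the snippet is self-contained, and the names fname, tline, rows and data are illustrative only. It also assumes every line has the same number of fields.

```matlab
% Write a small hypothetical sample file so the sketch runs on its own.
fname = [tempname '.dat'];
fid = fopen(fname, 'wt');
fprintf(fid, 'DATE  TIMESTAMP  T W M  T AL M C  A_B_C\n');
fprintf(fid, 'DATE  TIMESTAMP  T W M  T AL M C  A_B_C\n');
fclose(fid);

% Read the file one line at a time and split each line on 2+ whitespace chars.
fid = fopen(fname, 'rt');
rows = {};
tline = fgetl(fid);
while ischar(tline)                      % fgetl returns -1 at end of file
    if ~isempty(strtrim(tline))          % skip blank lines
        rows{end+1, 1} = regexp(strtrim(tline), '\s{2,}', 'split');
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

% Stack the per-line results into one cell array (one row per line);
% this only works if every line produced the same number of fields.
data = vertcat(rows{:});
disp(data)
```

Growing rows inside the loop is simple but can be slow for files of the size mentioned; preallocating is worth considering if the line count is known.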

Here's one way to read your file:

file.dat

    DATE  TIMESTAMP  T W M  T AL M C  A_B_C
    DATE  TIMESTAMP  T W M  T AL M C  A_B_C
    DATE  TIMESTAMP  T W M  T AL M C  A_B_C
    DATE  TIMESTAMP  T W M  T AL M C  A_B_C
    DATE  TIMESTAMP  T W M  T AL M C  A_B_C
    DATE  TIMESTAMP  T W M  T AL M C  A_B_C

MATLAB code:

    fid = fopen('file.dat', 'rt');
    C = textscan(fid, '%s %s %c%c%c %c%2c%c%c %s');
    fclose(fid);

    % Rejoin the single-character columns into the multi-word fields
    C = [C{1}, C{2}, ...
        cellstr( strcat(C{3},{' '},C{4},{' '},C{5}) ), ...
        cellstr( strcat(C{6},{' '},C{7},{' '},C{8},{' '},C{9}) ), ...
        C{10}]

The resulting cell array:

    C =
        'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'
        'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'
        'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'
        'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'
        'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'
        'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'

