Handling LARGE amounts of data in excel - Data Guru Needed?

Assuming the column fits in a single spreadsheet (less than a million rows for Excel 2007), I would take the following steps assuming the data is in column A and A1 has a title for the column: 1. Create a formula in B2 =LEN(A2). Then copy the formula down so every entry in Column A has this formula beside it.

This formula should give you the length of the cell in column A. 2. Sort by column B.

Go down to the point where the length is greater than 15. Delete all entries below that. At this point, you can also delete all the blank cells.

They will have a length of 0. 3. Create a formula in C2 =SEARCH("X",A2) and copy it down as in step 1.

If X is found, it will return a numerical value. If it is not found, #VALUE will appear. 4.Do a descending sort by column C.

Delete every entry with a numerical value. 5. Repeat 3 and 4 for Y, Z, B, and C.

I think the only way to do this within Excel itself would be to write some VBA code do it. VBA (or "Visual Basic for Applications") is a programming language built into all recent versions of Excel. However you probably don't want to learn VBA just for this job, but if it's worth money to you to get this done.

You could hire a freelance programmer to code this for you. It should be a simple and quick job for someone already experienced in VBA. If I was doing this myself, the way I'd go is to export the Excel spreadsheet to a CSV file, and then use some file processing utlitilies to filter out the lines you don't want.

Any set of Unix-like utilities has several tools that could easily do this. I have http://www.cygwin.com/ Cygwin handy for jobs like this. You can use the grep or sed programs within that to filter out text lines based on pattern matching of the kind you described.

If you've never used them before they'll take a little bit of time to learn, but a lot less than learning VBA I would think. Apart from Unix tools, there are also lots of scripting languages that could do the job, but if you don't already know them, they're not going to be worth learning just this one off simple task. However, if you have a friendly programmer to hand, you could ask them if they can knock you up a script in their favorte language.

With the specs given, it should be a simple task.

My first thought was exactly what lobo7922 said = a database is the best way to do this. Then I thought, well that's not what you originally asked, so why not do a little research. I believe that what you want can in fact be done.

I found this section on Excel 2007 in the help on Microsoft Office website about office.microsoft.com/en-us/excel/HP05200... Advanced Filters. This might give you the filter display you want. Then I found this other section on how to office.microsoft.com/en-us/excel/HA10244... copy visible cells only, instead of all cells including hidden ones.

I hope these lead you toward your answer. If you provide a sample data set and a sample result set you're looking for, I might be able to figure out the formula for you - but right now I'm on my way out to dinner for a friend's birthday.

Probably. I think you need to do some research into wild cards and the search/replace feature in Excel. This is where a knowledge of programming can come in handy.

Maybe a complex Macro might help too. But you'd have to learn how to write one. In Help type in Replace.

That should get you started a link below is an example of a result you might find for example something called PhraseSweeper. Don't know if it will help or not. Might be best to pay somebody online to do what you need maybe through something like Mahalo Tasks.

Assuming your not working with private information if so you'd probably want to make sure it was somebody reputable and probably have a contract. Keep in mind if you can't do it in excel there's nothing saying you can't find a tool or tools to do it with a csv or tabbed document which excel could save it into and later re-import. I might be willing to, for pay, to do this for you and I have the means to encrypt files and decrypt so your data would be safe while in transit across the web.

Either way good luck cause your going to be investing some time in this little project. Whatever you do backup a lot always keep your original expecially with large files which sometimes can crash MS programs.

I wouldn't recommend doing this in Excel, the besto option in case of real large amounts of data is using MySQL:mysql.com/Of course learning MySQL can't be done from one day to another. For advice and tips about Excel I recomend this webpage:mrexcel.com/For example there I found this webpage filled with free excel addins:cpearson.com/excel/xltools.htmGive it a look, could be useful. Just in case, also remember that Google has spreadsheets too:spreadsheets.google.com/Good luck :).

I cant really gove you an answer,but what I can give you is a way to a solution, that is you have to find the anglde that you relate to or peaks your interest. A good paper is one that people get drawn into because it reaches them ln some way.As for me WW11 to me, I think of the holocaust and the effect it had on the survivors, their families and those who stood by and did nothing until it was too late.

Related Questions