I have been asked to find any discrepancies on 2 spreadsheets. These are lists of projects from 2 different departments, but any project we do must go through both departments. S/S #1 is over 2300 lines, and s/s # 2 is over 800 lines. The problem is that S/S #1 lists the projects as a 5-digit number and has many repeated lines (for different stages of the preliminaries). S/s # 2 lists only the projects we are working on. The projects are listed only once, using the same 5-digit number but adds both a prefix and a suffix.
I have already spent 2 days going through the data and find that I am still missing things. There MUST be an easier way to do this!
I have been going through the Access lessons, but thought I knew Excel well enough to make those lessons unnecessary. Is there a solution in there that might make it worthwhile to purchase the Excel lessons as well? If so, which one? I don't have time to go through them all. Please help!
Obviously, this data needs to be moved to Access, but that is a task for another day, as it will involve training the people who use these s/s to use Access. I kinda need a 'NOW' solution!
Thanks.
Kevin Yip
@Reply 4 years ago
I assume you want to automate the comparison. You can stay in Excel for now, and use Excel VBA to loop through all the cells and columns of the two worksheets to find the differences. If the two worksheets have similar structures, the comparison should be a little easier. But it all depends on the nature of the data. Are the data sorted, sorted in the same way, or unsorted? What kinds of differences are there: additions and/or deletions to rows, columns, worksheets, etc., changes in cell values, etc. It could be a tall task to write VBA for a comprehensive comparison that covers all possible differences. And what kind of results do you want your comparison to give to the user? That's not straightforward either. For instance, in a simple scenario like this:
If Worksheet 1 has:
row 1: John, D., 2/1/1950, M
row 2: Jane, D., 3/4/1960, F
row 3: Jay, D. 4/5/1970, M
row 4: Juan, D., 5/6/1980, M
row 5: Jesus, D., 6/7/1990, M
And Worksheet 2 has:
row 1: John, D., 2/1/1950, M
row 2: June, D. 7/8/2000, F
row 3: Jane, D., 3/4/1960, F
row 4: Jay, D. 4/5/1970, M
row 5: Juan, D., 5/6/1980, M
row 6: Jesus, D., 6/7/1990, M
Should the result tell you row 2 to 6 have all been changed (which they have, if you compare row by row)? Or should it give you a sensible result that is much more helpful: only row 2 has been added, no other changes? The latter could be tough to do with programming, because certain rows are moved but unchanged.
Another way is to import everything to Access, differences and all. Then let Access perform "aggregate" queries on them. Rows that are identical would be combined, and those that are different would remain separate. I believe Richard has a video on that very topic but I can't remember what it is.
Faith HarveyOP
@Reply 4 years ago
Thank you for your suggestion. I'm not that au fait with Macros, to be honest, although I'd love to learn.
Taking your example, it's more like this:
Worksheet 1 has:
row 1: John, D., 2/1/1950, M
row 2: Jane, D., 3/4/1960, F
row 3: Jay, D. 4/5/1970, M
row 4: Juan, D., 5/6/1980, M
row 5: Jesus, D., 6/7/1990, M
row 6: John, D., 2/1/1950, M
row 7: Jane, D., 3/4/1960, F
row 8: Jay, D. 4/5/1970, M
row 9: Juan, D., 5/6/1980, M
row 10: John, D., 2/1/1950, M
row 11: Jane, D., 3/4/1960, F
row 12: Jay, D. 4/5/1970, M
And Worksheet 2 has:
row 1: Mr John G, D., 2/1/1950, M
row 2: Mrs June A, D. 7/8/2000, F
row 3: Mrs Jane B, D., 3/4/1960, F
row 4: Mr Jay P, D. 4/5/1970, M
row 5: Mr Jesus R, D., 6/7/1990, M
I need to isolate those people who are not on both lists (I have added a prefix and a suffix to the names in one list, and there is one item in each list that is not on the other list but still may not be unique).
My lists are sorted, but with a total of about 3000 lines, that hasn't helped me much. So far, I've been using the filter function and highlighting each line according to whether or not it is on both lists. Kill me now...
Kevin Yip
@Reply 4 years ago
If you just need to identify people who are not on both lists, you can perform a trick. Add a dummy column to both lists. On Sheet 1, the column contains "1" in every row. On Sheet 2, the column contains "2" on every row. Create a blank sheet, and copy Sheet 1 to it, and then copy and append Sheet 2 to the bottom of Sheet 1. I assume both sheets have matching column structure, so both dummy columns should now be on the same column. Create a Pivot Table that sums the new dummy column. The rows with a sum of 3 are the ones in both sheets, and those whose sum is less than 3 are not.
Sorry, only students may add comments.
Click here for more
information on how you can set up an account.
If you are a Visitor, go ahead and post your reply as a
new comment, and we'll move it here for you
once it's approved. Be sure to use the same name and email address.
This thread is now CLOSED. If you wish to comment, start a NEW discussion in
Excel Forum.