I am trying to sum up values based on duplicate’s found across “A-O” columns. Am using the below macro. There are around 500k+ records and the below macro hangs bad.
Sub Formulae(TargetCol1, TargetCol2, ConcatCol, Col1, Col2, StartRow, EndRow, Sheet)
Sheets(Sheet).Range(TargetCol1 & CStr(StartRow)).Formula = "=SUMIF($" & ConcatCol & "$" & CStr(StartRow) & ":$" & ConcatCol & "$" & CStr(EndRow) & "," & ConcatCol & CStr(StartRow) & ",$" & Col1 & "$" & CStr(StartRow) & ":$" & Col1 & "$" & CStr(EndRow) & ")"
Sheets(Sheet).Range(TargetCol1 & CStr(StartRow)).Select
Selection.Copy
Sheets(Sheet).Range(TargetCol1 & CStr(EndRow)).Select
Range(Selection, Selection.End(xlUp)).Select
Application.CutCopyMode = False
Selection.FillDown
Call PasteSpecial(TargetCol1, "T", StartRow, EndRow)
Sheets(Sheet).Range(TargetCol2 & CStr(StartRow)).Formula = "=SUMIF($" & ConcatCol & "$" & CStr(StartRow) & ":$" & ConcatCol & "$" & CStr(EndRow) & "," & ConcatCol & CStr(StartRow) & ",$" & Col2 & "$" & CStr(StartRow) & ":$" & Col2 & "$" & CStr(EndRow) & ")"
Sheets(Sheet).Range(TargetCol2 & CStr(StartRow)).Select
Selection.Copy
Sheets(Sheet).Range(TargetCol2 & CStr(EndRow)).Select
Range(Selection, Selection.End(xlUp)).Select
Application.CutCopyMode = False
Selection.FillDown
Call PasteSpecial(TargetCol2, "U", StartRow, EndRow)
End Sub
Sub PasteSpecial(Col1, Col2, StartRow, EndRow)
Range(Col1 & CStr(StartRow)).Select
Range(Selection, Selection.End(xlDown)).Select
Selection.Copy
Range(Col2 & CStr(StartRow)).Select
Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
:=False, Transpose:=False
End Sub
Let me explain the macro in short. I have Columns “A-O” and I have to group them…based on grouping I have to sum columns “P,Q”. I have a function that makes a concatenated string out of the 16 columns and stores in “AA” column. Based on this column I use the sumif function to sum all duplicate values
=SUMIF($AA$2:$AA$500000,$AA2,$P$2:$P$500000)
=SUMIF($AA$2:$AA$500000,$AA2,$Q$2:$Q$500000)
Then I copy paste special as ‘values’ the above values to remove the formula, in 2 new cols (pasteSpecial function in above macro code).
Finally I call the remove duplicates to remove the duplicate values
I have used the .removeduplicates method which seems to work pretty fast even on such a huge dataset. Is there any predefined function in excel which would even sum the values of the duplicates and then remove the duplicate entries?
Sub Remove_Duplicates_In_A_Range(StartRow, EndRow, Sheet, StartCol, EndCol, level)
Sheets(Sheet).Range(StartCol & CStr(StartRow) & ":" & EndCol & CStr(EndRow)).RemoveDuplicates Columns:=20, Header:=xlNo
End Sub
The above logic hangs bad eating all CPU resources and crashing badly…
Someone please optimize the above macro to make it work with 500k+ records. A performance of 1-2 mins max is acceptable.
Please help!!!
EDIT: By 500k+ records I mean A1:O500000. Am supposed to check for duplicates in this manner a combination of A1,B1,C1,D1,E1,F1,G1,H1,I1,J1,K1,L1,M1,N1,O1 with A2,B2,C2,D2,E2,F2,G2,H2,I2,J2,K2,L2,M2,N2,O2 and A3,B3,C3,D3,E3,F3,G3,H3,I3,J3,K3,L3,M3,N3,O3 and so on….till A500000,B500000 etc… .
In short am supposed to check the entire A1-O1 set matches with the entire A2-O2 or A3-O3 or….. A500k-O500k and so on
For every match between the entire A-O recordset I need to sum their respective P,Q columns . Say for example A1-O1 set matched with A2-O2 set then add P1,Q1 and P2,Q2 and store in P1,Q1 or something..
In either case, I need to retain each original recordset say,A1-O1 with the summed up values of its duplicates and its own in P1,Q1
I dont suppose we can attach a demo of the excel sheet here now, can we? 🙁
EDIT2:
Function for replicating sumif formula across all cells
Sub PreNettingBenefits(StartRow1, EndRow1, StartRow2, EndRow2, Col_Asset, Col_Liab, Src_Col_Asset, Src_Col_Liab, ConcatCol, Src_ConcatCol, level, Sheet2, Sheet1)
'=SUMIF(Sheet1!$AA$2:$AA$81336,Sheet2!AA2,Sheet1!$P$2:$P$81336)
Application.Calculation = xlCalculationAutomatic
Sheets(Sheet2).Range(Col_Asset & CStr(StartRow2)).Formula = "=SUMIF(" & Sheet1 & "!$" & Src_ConcatCol & "$" & CStr(StartRow1) & ":$" & Src_ConcatCol & "$" & CStr(EndRow1) & "," & Sheet2 & "!" & ConcatCol & CStr(StartRow2) & "," & Sheet1 & "!$" & Src_Col_Asset & "$" & CStr(StartRow1) & ":$" & Src_Col_Asset & "$" & CStr(EndRow1) & ")"
Sheets(Sheet2).Range(Col_Asset & CStr(StartRow2)).Select
Selection.Copy
MsgBox Sheets(Sheet2).Range(Col_Asset & CStr(EndRow2)).Address
Sheets(Sheet2).Range(Col_Asset & CStr(EndRow2)).Select
Range(Col_Asset & CStr(StartRow2) & ":" & Col_Asset & CStr(EndRow2)).Select
Application.CutCopyMode = False
Selection.FillDown
Sheets(Sheet2).Range(Col_Liab & CStr(StartRow2)).Formula = "=SUMIF(" & Sheet1 & "!$" & Src_ConcatCol & "$" & CStr(StartRow1) & ":$" & Src_ConcatCol & "$" & CStr(EndRow1) & "," & Sheet2 & "!" & ConcatCol & CStr(StartRow2) & "," & Sheet1 & "!$" & Src_Col_Liab & "$" & CStr(StartRow1) & ":$" & Src_Col_Liab & "$" & CStr(EndRow1) & ")"
Sheets(Sheet2).Range(Col_Liab & CStr(StartRow2)).Select
Selection.Copy
MsgBox Sheets(Sheet2).Range(Col_Liab & CStr(EndRow2)).Address
Sheets(Sheet2).Range(Col_Liab & CStr(EndRow2)).Select
Range(Col_Liab & CStr(StartRow2) & ":" & Col_Liab & CStr(EndRow2)).Select
Application.CutCopyMode = False
Selection.FillDown
Application.Calculation = xlCalculationManual
End Sub
It hangs pretty bad. Whts the problem in replicating the formula across 30k-40k rows. Could someone please optimise the code?
Something must be terribly wrong with how you are doing the adding of the duplicates. Since you were scant on details of the data you are working with, I don’t know if this is the same, but I populated A1:O33334 (over 500k cells) with a random number between 1 and 10,000.
Using a dictionary object (I am known for my love and over-use of it), I went through all of them and summed only the duplicate values and then slapped the unique list of elements into column A in sheet2.
Reasons why a dictionary might be the thing to use:
The dupe checking and addition, and copying the unique cells only takes 2 seconds. Here is the code for your reference.