Salesforce Apex Batch Jobs
Those of you who develop on Salesforce's Force.com platform probably know what a batch job, or Batch Apex, is. I too thought I knew a thing or two about it. Over my seven years of development on the platform, I have used the batch feature maybe around twenty times. So I was confident about taking up a new requirement in my current project: merge and cleanse the lead records that had been sitting in the system for the last four years. Keep in mind that the company I work for is a leading product company worldwide and one of Salesforce's earliest customers, as well as a key one. This meant I was going to deal with around 2.5 million lead records.
The requirement went like this: the business team wants to close the garbage leads that have no email address. Wow, that's simple!
For the duplicate leads, develop a ranking algorithm to find the survivor and the victims among the duplicates, then close the victims. On the survivor lead, copy over the products the closed victim leads were interested in to a related custom object. OK, I can manage that.
Next, create the products of interest in the same related custom object for the non-duplicate leads as well. Cool, I can do that. Was I overconfident?
I wrote three batch jobs for these requirements:
- This job would query the leads with garbage data and mark them closed. It also scans through the leads in each chunk of the batch, runs the ranking algorithm, and finds the survivor and victims. Since duplicate leads can appear in other chunks as well, I needed to maintain the state of the batch using Database.Stateful (see the sketch after this list). I was aware of the heap size limit, which is 12 MB for asynchronous Apex. In the finish method of the job, I used batch job chaining and invoked my second batch.
- This batch would take the survivor and victim IDs from the first job, query the database, feed the results to the execute method in chunks of 200, and close the victim leads after creating the related custom object records. The chunking was done to stay under the Salesforce governor limit of 10,000 DML rows per transaction. In the finish method of this job, I called the third job.
- The purpose of this batch job was to create the related records for the non-duplicate leads.
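Here is a minimal sketch of that first design, with assumed class and helper names and the ranking logic stubbed out. The point is the shape: Database.Stateful collections that grow on every chunk, and a finish method that chains the next job.

```
// Sketch of the original design (class and helper names are assumptions).
global class RankDuplicateLeadsBatch implements Database.Batchable<SObject>, Database.Stateful {

    // State carried across every chunk of the run. Against ~2.5 million
    // leads, these sets only ever grow, which is what eventually blew
    // past the 12 MB asynchronous heap limit.
    global Set<Id> survivorIds = new Set<Id>();
    global Set<Id> victimIds = new Set<Id>();

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id, Email, Status FROM Lead WHERE IsConverted = false');
    }

    global void execute(Database.BatchableContext bc, List<Lead> scope) {
        // Close garbage leads (no email) right away; omitted for brevity.
        // Then run the ranking algorithm over this chunk and grow the
        // stateful sets, e.g.:
        // survivorIds.addAll(...); victimIds.addAll(...);
    }

    global void finish(Database.BatchableContext bc) {
        // Chain the second job and hand over the accumulated collections:
        // Database.executeBatch(new CloseVictimLeadsBatch(survivorIds, victimIds), 200);
    }
}
```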
All of this sounded like a plan to me, and I implemented it. I tested with a couple of thousand records in the dev and staging environments and the jobs executed well. Except, that is, when I ran them in production.
When I kicked off the job from the Developer Console, I rushed to the “Apex Jobs” screen to watch the progress. There were around 12,000 batches. The job started executing, and when I clicked refresh a minute later, the total number of batches had come down to 3,000, with 80 completed. There were no failures, so I wondered why the total had dropped. I refreshed once more to find all three jobs completed: the first job had just 87 batches, the second had just 2, and the third had only 60. I was expecting around 12,000 batches, mind you!
I worked with the Salesforce support team and figured out that the job had hit the heap size limit after processing a certain number of records. The only way to avoid this while running against this many records was to drop Database.Stateful, which I obviously could not do, since I was scanning the entire database for duplicates: you never know which chunk a duplicate will turn up in, so the state has to be maintained. But I had screwed up!
I spent the next two days modifying my logic, but I had the nagging feeling that I was overthinking it and making it more complex. I let it go for a day; the next morning I sat on the office patio and gave it a thought. There you go! Why not just sort the query by email address, the field used to identify duplicates? I modified the code so that all duplicates arrive together and I can pick the victims and survivor with my ranking algorithm in one go. I still need to maintain the state of the batch, but this time only to check whether the first set of email addresses in a chunk is a continuation of the previous one. Once that is determined, I release the memory (and with it the heap) and carry on with the processing.
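One way to implement that idea looks like the sketch below; the class name is hypothetical and the ranking logic is stubbed out. The query is ordered by Email, so the only state that survives between chunks is the trailing email group, held back in case it continues into the next chunk.

```
// Sketch of the fix: ORDER BY Email plus a tiny piece of carried state.
global class DedupeLeadsBatch implements Database.Batchable<SObject>, Database.Stateful {

    // The only state kept between chunks: the last email group of the
    // previous chunk, since a duplicate group may straddle the boundary.
    global List<Lead> carryOver = new List<Lead>();

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id, Email FROM Lead ' +
            'WHERE IsConverted = false AND Email != null ORDER BY Email');
    }

    global void execute(Database.BatchableContext bc, List<Lead> scope) {
        // Group this chunk's leads by email, seeded with the leads held
        // back from the previous chunk.
        Map<String, List<Lead>> byEmail = new Map<String, List<Lead>>();
        List<Lead> leads = new List<Lead>(carryOver);
        leads.addAll(scope);
        for (Lead l : leads) {
            if (!byEmail.containsKey(l.Email)) {
                byEmail.put(l.Email, new List<Lead>());
            }
            byEmail.get(l.Email).add(l);
        }

        // The last email in sorted order may continue into the next chunk,
        // so hold that group back and release everything else.
        String lastEmail = scope[scope.size() - 1].Email;
        carryOver = byEmail.remove(lastEmail);

        for (List<Lead> dupes : byEmail.values()) {
            if (dupes.size() > 1) {
                // Run the ranking algorithm here: pick the survivor, copy
                // the products of interest over, and close the victims.
            }
        }
    }

    global void finish(Database.BatchableContext bc) {
        // The very last email group is still in carryOver; process it here.
    }
}
```

Because the heap now holds at most one extra email group at a time, memory use stays flat no matter how many leads the job scans.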
I also dropped the poor idea of chaining the batches and passing collections between them. Finally, I came up with three batches again, but this time:
- This one checks for garbage records and closes them right there: query in the start method, DML in execute, and nothing in the finish method (see the sketch after this list). The batch is processed in chunks of 200, which ensured I would never run into any governor limits.
- This one figures out the survivor and victims, creates a related record on the survivor based on the victims' products, and closes out the victims. The state was still maintained, but trust me, with this logic the heap didn't even reach 100 KB.
- This batch didn't change much; it creates the related records for the non-duplicate leads and the survivor leads.
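For completeness, the first of the rewritten jobs might look like the sketch below. The class name and the closed status value are assumptions, but the shape is the point: no Database.Stateful, no chaining, and an empty finish method.

```
// Sketch of the rewritten garbage-lead job: stateless and self-contained.
global class CloseGarbageLeadsBatch implements Database.Batchable<SObject> {

    global Database.QueryLocator start(Database.BatchableContext bc) {
        // Garbage leads are the ones without an email address.
        return Database.getQueryLocator(
            'SELECT Id, Status FROM Lead WHERE Email = null AND IsConverted = false');
    }

    global void execute(Database.BatchableContext bc, List<Lead> scope) {
        for (Lead l : scope) {
            l.Status = 'Closed - Not Converted'; // assumed closed status value
        }
        update scope; // at most 200 rows here, far below the 10,000-row DML limit
    }

    global void finish(Database.BatchableContext bc) {
        // Intentionally empty: no chaining and no shared collections.
    }
}
```

Each job is kicked off independently, for example with `Database.executeBatch(new CloseGarbageLeadsBatch(), 200);` from the Developer Console.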
I ran the batch jobs in production and, yes, I rose like a phoenix. The run was successful: every record was processed, even though the jobs ran for around 5–6 hours.
Please reach out to me if you need help with the code or would like to see it. Thank you for reading, and I hope this helps you someday.