A Cadence DRAM Memory Controller IP customer asks, "I have a DRAM subsystem with ECC and my system has the capability to use write data masks and partial-word writes. DDR3 has a reset pin, why can't I just reset it? Why do I need to initialize the memory?"
The answer is "yes, you must initialize it", but the reason may be surprising to many people: DRAM contents are not lost the moment the power is turned off! This week I stumbled upon a great research paper from Princeton University that is interesting in itself (it talks about how most encryption is vulnerable to hacking through the DRAM) and also has some interesting data about just how long data can persist in DRAM. The researchers found that some of the bits in DRAM were still holding charge minutes after losing power, even when the memory was removed from the machine entirely. A video clip located here shows the process they used, and at one point shows them removing a DIMM from one machine and putting it into another - and then retrieving all the data out of that DIMM!
For those interested in information technology security, this data suggests the importance of encrypting the contents of DRAM as well as the contents of a hard drive - but that was not the concern of the customer who asked the original question. When using ECC (Error Correcting Codes) in DRAM, a typical arrangement is to have 64 bits of data and 8 extra ECC bits that hold a SECDED (single-error-correcting, double-error-detecting) code, which can correct a 1-bit error and detect a 2-bit error in the 64 bits of data (Cadence's DRAM controller allows other sizes like 32&7, 32&4, 16&2, but let's stick with 64&8 for now).
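To make the 64&8 arrangement concrete, here is a minimal Python sketch of a textbook extended Hamming (72,64) code. The bit layout and the names `encode` and `decode` are assumptions chosen for illustration; this is not necessarily the code that any particular controller, Cadence's included, actually implements.

```python
# Textbook extended Hamming (72,64) SECDED sketch. Bit layout is an assumption.
# Index 0 holds the overall parity bit; indices 1..71 are Hamming positions,
# with the 7 check bits at the power-of-two positions.
CHECK_POSITIONS = [1, 2, 4, 8, 16, 32, 64]
DATA_POSITIONS = [p for p in range(1, 72) if p not in CHECK_POSITIONS]  # 64 slots


def _syndrome(word):
    """XOR together the positions (1..71) of all bits that are set."""
    syn = 0
    for pos in range(1, 72):
        if word[pos]:
            syn ^= pos
    return syn


def encode(data):
    """Spread 64 data bits into a 72-bit codeword (a list of 0/1 values)."""
    word = [0] * 72
    for i, pos in enumerate(DATA_POSITIONS):
        word[pos] = (data >> i) & 1
    syn = _syndrome(word)  # check bits are still 0 here, so this covers data only
    for k, cp in enumerate(CHECK_POSITIONS):
        word[cp] = (syn >> k) & 1  # choose check bits that force the syndrome to 0
    word[0] = sum(word[1:]) & 1  # overall parity makes the total bit count even
    return word


def decode(word):
    """Return (status, data); status is 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)  # work on a copy so the caller's codeword is untouched
    syn = _syndrome(word)
    parity_bad = sum(word) & 1
    if syn == 0 and not parity_bad:
        status = "ok"
    elif parity_bad and syn < 72:
        word[syn] ^= 1  # odd number of flips: syndrome names the bad position
        status = "corrected"
    else:
        status = "uncorrectable"  # even number of flips with a nonzero syndrome
    data = sum(word[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, data


original = 0xDEADBEEFCAFEF00D
codeword = encode(original)
codeword[10] ^= 1  # one flipped bit: decode reports "corrected"
print(decode(codeword)[0], decode(codeword)[1] == original)
codeword[20] ^= 1  # a second flipped bit: decode reports "uncorrectable"
print(decode(codeword)[0])
```

Note that a word whose data and check bits have decayed independently is unlikely to land in the "ok" or "corrected" cases, which is where the trouble described next comes from.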
The problem arises when the system needs to write less than the full 64 bits of data. The memory controller must then perform a Read-Modify-Write (RMW) operation on the memory location so that it can preserve the part of the existing data that is not being overwritten and recompute the ECC bits over the full 64-bit word. If the 64 bits being partially overwritten hold stale, partially degraded contents from the previous time the DRAM was used (for example, if the machine was turned off momentarily and then turned back on again), the data and its ECC bits will most likely no longer agree, and the memory controller will encounter ECC errors during the read portion of the RMW operation.
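The failure can be reproduced with a small Python model. Everything here is a hypothetical sketch: `EccMemory`, `check_bits` and `EccError` are illustrative names, the check function is a textbook Hamming-plus-parity calculation, and for brevity the model only detects a mismatch and never attempts correction.

```python
CHECK_POSITIONS = (1, 2, 4, 8, 16, 32, 64)
DATA_POSITIONS = [p for p in range(1, 72) if p not in CHECK_POSITIONS]
MASK64 = (1 << 64) - 1


class EccError(Exception):
    """Raised when stored data and stored check bits disagree."""


def check_bits(data):
    """8 check bits for a 64-bit word: 7 Hamming bits plus an overall parity bit."""
    syn = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (data >> i) & 1:
            syn ^= pos
    parity = (bin(data).count("1") + bin(syn).count("1")) & 1
    return syn | (parity << 7)


class EccMemory:
    """Toy model of 64-bit words, each stored with 8 check bits."""

    def __init__(self, stale_cells):
        # stale_cells maps address -> (data, check) left over from a previous run
        self.cells = dict(stale_cells)

    def write_full(self, addr, data):
        """Full-width write: no read needed, check bits are freshly computed."""
        data &= MASK64
        self.cells[addr] = (data, check_bits(data))

    def read(self, addr):
        data, check = self.cells[addr]
        if check != check_bits(data):
            raise EccError(f"ECC mismatch at address {addr:#x}")
        return data

    def write_masked(self, addr, data, byte_enables):
        """Partial write: read, merge the enabled byte lanes, write back."""
        merged = self.read(addr)  # the read half of the RMW; fails on stale cells
        for lane in range(8):
            if (byte_enables >> lane) & 1:
                lane_mask = 0xFF << (8 * lane)
                merged = (merged & ~lane_mask) | (data & lane_mask)
        self.write_full(addr, merged)


# A location whose data and check bits no longer agree (modelled by flipping
# some check bits of an otherwise valid word).
stale = 0x1122334455667788
mem = EccMemory({0x0: (stale, check_bits(stale) ^ 0x5A)})

try:
    mem.write_masked(0x0, 0xFF, 0b00000001)
except EccError as err:
    print("before initialization:", err)

mem.write_full(0x0, 0)  # initialize with a full-width write
mem.write_masked(0x0, 0xAB, 0b00000001)  # the same partial write now succeeds
print("after initialization:", hex(mem.read(0x0)))
```

The full-width write never reads the old contents, which is why it can safely clean up a stale location while the masked write cannot.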
Wait a minute, don't newer DRAMs like DDR3, DDR4 and LPDDR2 have a reset pin?
Yes they do - but that reset only resets the memory state machines; it is not guaranteed to reset (or not reset) the memory contents.
Umm... okay, so what do I do about it?
I'm glad you asked! The simplest thing is never to do masked or partial-word writes - then any time you use a memory location that had old data in it, you will overwrite it completely. Your system will still see ECC errors, though, if it happens to read a location in memory that has not been written to yet. This solution is also impractical for systems working with short and irregular data packets, such as networking and video.
In simulation, you can use advanced properties of your Verification IP (VIP) such as Cadence's VIP Catalog Memory Models (formerly known as Denali MMAV) to set a pre-assigned value into all the DRAM when you start simulations, so that you don't have to initialize the DRAM in simulation every time. Just be sure to do your final signoff on a memory with randomly assigned background data and do take care to initialize it.
For your real system, you can write a program for your CPU that writes to every DRAM location, although this could take a while. Cadence's DRAM Memory Controller IP has a BIST option that runs a hardware test on the DRAM and leaves the DRAM's ECC check bits in a correctly calculated state, and it runs significantly faster than a software loop.
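The software approach boils down to one full-width write per word. Here is a minimal Python sketch of that loop; `initialize_dram` and the `write_word` callback are hypothetical names standing in for whatever full-width store your platform provides, and the tiny dictionary-backed memory exists only to make the example runnable.

```python
def initialize_dram(write_word, base, size_bytes, word_bytes=8, fill=0):
    """Issue one full-width write per word and return how many were issued.

    write_word(addr, value) is assumed to perform an unmasked write of
    word_bytes bytes, so the controller can compute fresh ECC bits without
    reading the old contents.
    """
    if size_bytes % word_bytes:
        raise ValueError("size must be a whole number of words")
    count = 0
    for addr in range(base, base + size_bytes, word_bytes):
        write_word(addr, fill)
        count += 1
    return count


# Stand-in for real hardware: a dictionary that records each write.
backing_store = {}


def fake_write(addr, value):
    backing_store[addr] = value


writes = initialize_dram(fake_write, base=0x8000_0000, size_bytes=64)
print(writes, "full-width writes issued")
```

With 8-byte words, a 4 GiB memory needs 2^29 (about 537 million) such writes, which gives a feel for why doing this from the CPU can take a while.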
Now... how do I encrypt that DRAM?
Maybe a topic for another blog...