Saturday, 21 February 2015

Search in Random Access Memory

Image from 
Random access memory is simply an array of directly addressable elements like Bytes, Words, Longs, ... or even Strings and ... Each memory element is directly addressable by its distance from the first element, this is what we get used to it and picture it as arrays in programming languages. If you are a database programmer then you may simply show it by a two columns table, the first column as the primary key or address of the element and the second column as the content of the element.

So if you have an 8 bits address register, then you can address up to 2^8=256 elements, for 32 bits you can address up to 2^32=4G elements and for a 64 bits you can have up to 2^64=16 E (Exa) elements. It means that a typical 64 bit processor can address up to 16E Bytes of memory in theory, although they don't do it (I think most of them are using 48 bits at the moment).

Note that computers can address Words instead of Bytes, in facts the size of the memory elements in bit which is called word size can be 8, 16, 32, ... So in this case, the total memory capacity will be the number of addressable elements times size of the memory's word.

To read from or write into a memory element you need to have an address register capable of keeping the size of the address space, so for our 64 bit processor, we need a 64 bit address register to be able to address each of the memory element in random. We also need a word register which lets us read data addressed by address register or write it in the addressed place. The following pseudo codes are for what we talked about:

// AR is Address Register
// WR is Word Register
// R1 is just a register

// sample write to a random memory place
Move WR,(data to be stored)
Move AR,(address to be put word data in it)

// sample read from a random memory place
Move AR, (address to be put word data in it)
Move WR, R1

Or if you store your data in a database table with an incremental primary key then all you need is just execute something like the following queries:

// a sample read
select <element> from <table> 
  where <primary_key> = "element_address"

// a sample write
update <table> set <element> = "element_data" 
  where <primary_key> = "element_address"

Or if we want to implement it as an array, in Java we can have it like:

myRAM = new int[2048]; // 2048 is the size of address space
myRAM[102] = -200;     // write to a random address;
int i = myRAM[102];    // read from a random address;

Now how we can search in RAM and look for something we are interested in? At lowest level we may have something like this for each comparing:

Move R1, 334    // usually, at least, one operand must be a register
CMP  R1, 224    // the second can be a register or number or memory

IEQ  EqualLabel // after comparison you need to test a compare flag
                // bit and jump based on its value

If the content of the memory element is something like string or even records or anything larger than the processor's word size, then the compare gets complicated and needs some loops over all record  detail and perhaps another loop for each field of the record.

Now consider you want to find a match in RAM with size of 10^300 bytes!? This was the storage size of a 1000 bit Spars Distributed Memory gives us. You can't just start comparing from the beginning address of the RAM or divide RAM into some smaller part and do some parallel search for the memory element you are looking for! (you can but it doesn't give you the result you are looking for) Sorting also doesn't give you a good result, you may need to compare log2(10^300) ~ 1000 times for each memory element, it is still too much time consumer.

Why am I talking about these basic things? Because the methods we are using to store and retrieve dense data aren't useful for huge amount of memory, this is what we see in relational databases, or even in many modern big data storage and retrieval technologies, but how we human do this in our mind!?

It seems that mother nature has already had a plan for this problem and has known how to deal with the storage and searching in huge amount of memory we human keep during our life.