Creating and Using RefId Values


The RefId is one of the pil­lars of the SIF spec­i­fi­ca­tion — its under­stand­ing and prop­er use will make the dif­fer­ence between a suc­cess­ful and effi­cient imple­men­ta­tion and one that will be a con­tin­u­al source of prob­lems and incom­pat­i­bil­i­ties.

RefId val­ues are used to unique­ly iden­ti­fy objects. Almost all SIF objects have their own RefId whose val­ue is used to unique­ly and per­ma­nent­ly iden­ti­fy the object they rep­re­sent.


When the SIF object is estab­lished, the val­ue of the RefId should be cre­at­ed and it should remain con­stant as long as that SIF object rep­re­sents that object.
So, for exam­ple, when a Stu­dent­Per­son­al object is cre­at­ed to hold stu­dent demo­graph­ic and oth­er per­son­al infor­ma­tion, the cor­re­spond­ing RefId val­ue should be assigned and should nev­er be changed as long as the stu­dent remains enrolled. Oth­er rules (please don't be offend­ed if they appear too sim­ple; they rep­re­sent real prob­lems)

  • Nev­er send a Delete or a Change before an Add for a giv­en RefId (send mes­sages in the prop­er order)
  • Nev­er send an Add or a Change after a Delete for a giv­en RefId (do not reuse RefId val­ues)
  • RefId val­ues for a giv­en object should nev­er change - even between school years. An excep­tion to this would be if a stu­dent moves from one school to anoth­er and his or her records are man­aged by dif­fer­ent sys­tems, then his or her RefId val­ues would be dif­fer­ent in the new sys­tem. But, if the same sys­tem still man­ages his or her records, then the RefId val­ues should remain the same.

The Assigned Value

The val­ue of a RefId should be 32 hexa­dec­i­mal char­ac­ters (0−9 and the let­ters A-F), no dash­es or spaces. This val­ue is also some­times referred to as a GUID.
The val­ue assigned to a RefId should be “glob­al­ly unique”. The best way to do this is to use a func­tion that gen­er­ates a GUID (oper­at­ing sys­tems and the SQL lan­guage have built-in func­tions for gen­er­at­ing these). For exam­ple, the fol­low­ing val­ue may be giv­en as a default val­ue to a RefId field in a SQL Serv­er data­base to have them auto­mat­i­cal­ly gen­er­at­ed:


The SQL Serv­er NEWID() func­tion gen­er­ates GUID val­ues with dash­es in them; the REPLACE() func­tion removes them, and by sup­ply­ing this as the default val­ue it will be auto­mat­i­cal­ly gen­er­at­ed and filled in for new objects when the val­ue is not sup­plied when an INSERT is not done.
Using MySQL, the func­tion would look like:

UNHEX(REPLACE(UUID(),' - ',' ') ); 

Using Ora­cle, this func­tion is called sys_guid(); in Post­GreSQL they have sev­er­al func­tions such as uuid_generate_v4() which allow you to choose your GUID gen­er­a­tion option. Most oth­er mod­ern data­bas­es have such func­tions built in. so there shouldn't be any need for any­one to write new code.

Calculating RefId Values

Ide­al­ly, RefId val­ues should be assigned by a func­tion such as NEWID(), mapped to an exist­ing object and main­tained in a data­base. This will always lead to a sys­tem that works well, works under all cir­cum­stances and works well with oth­er appli­ca­tions from all sup­pli­ers (if they also use RefIds prop­er­ly).

RefId val­ues should nev­er be cal­cu­lat­ed from the val­ue of any data that was orig­i­nal­ly entered by hand.

For exam­ple, We’ve seen a sys­tem that cal­cu­lat­ed a Stu­dent­Per­son­al RefId val­ue by tak­ing the school num­ber and the stu­dent ID and apply­ing a func­tion to them to get a 32-char­ac­ter hexa­dec­i­mal val­ue. The intent was to avoid hav­ing to store the val­ues in a data­base. The prob­lem occurred when the data from two schools were com­bined into a sin­gle data­base. When the func­tion was applied to the two School A, Stu­dent B’s infor­ma­tion, the RefId gen­er­at­ed matched that from school X, Stu­dent Y’s infor­ma­tion. (Actu­al­ly the prob­lem was even more hard to see than this, and it took many days to track down what was hap­pen­ing – but the result was that the com­bined data­base was being cor­rupt­ed because the two stu­dents’ records were being “merged” acci­den­tal­ly).

A pos­si­ble excep­tion to this might be if the basis for the cal­cu­la­tion is some­thing that guar­an­teed to be 100% unique, all the time in all con­texts and if the func­tion for map­ping the basis for the cal­cu­la­tion to the RefId has been thor­ough­ly test­ed. Our rec­om­men­da­tion is to nev­er do this. Our expe­ri­ence has shown that every­one who has imple­ment­ed cal­cu­lat­ed RefIds has been 100% uncon­vinc­ing­ly con­fi­dent that their mod­el is dif­fer­ent, it will work out well, and in every case it has worked out ter­ri­bly.

Keeping Backup Copies

If you do what we sug­gest – that is, to ran­dom­ly gen­er­ate RefId val­ues for new SIF objects and map them to the oth­er objects they rep­re­sent – it will be very impor­tant to
make sure that the SIF agent’s map­ping tables are backed up on the same sched­ule as the application’s data­base. This is done so that if a restore of one is done, the cor­re­spond­ing map­ping tables can be done as well.

Accidental Deletes

Extreme care should be tak­en when delet­ing an object of any kind. Unfor­tu­nate­ly, once in a while, some­thing may be delet­ed by mis­take. In that case, when an out of the ordi­nary error must be cor­rect­ed, wait for an hour or more after the Delete event and then send an Add event. This should nev­er be a reg­u­lar prac­tice, but it is a work around for cas­es where human error caused a delete of one or more objects in a string of high­ly linked objects and a RefId must be pre­served.

Leave a Reply

Your email address will not be published. Required fields are marked *

Do NOT follow this link or you will be banned from the site!
Instant SSL