The Fidonet Nodelist Database Project ...
[19.08.2013] I have a still open task in my work queue
since a long time that didn't make any progress.
The structure of current Fidonet Nodelist Database collects the
content of all nodelists ever published. The content has been unified.
In database structures viewpoint, this is a real relational database structure concept.
This means, whenever a new node has been listed, the new nodenumber
becomes a new record in the database. This record never changes
despite the fact, the record is listed every week in the current
nodelist. If the speed setting or another nodelist flag changes,
then a new record will be written into the database.
Another table now records all the nodelists ever imported by Date and the Julian daynumber in the year.
A third table consists of records, that links the nodelist records with the nodelists ever published.
Pointlists follows in principle the Nodelist structure, but only in
First, there exist at least 5 different Pointlist formats that are documented under FTS-5002 Pointlist Formats
Despite the fact, there are all pointlists, the structure is considerable different.
But there is one in common, what all pointlist formats shares. This is a reference to a Nodelist record as all Points listed, are listed as downlinks of a specific Node. So all pointlists have a reference either way to a Nodelist Node record - the so called Bossnode.
So here we're right in the middle of a structural definition.
Each Pointlist point record requires to be referenced to an Uplink Bossnode that is part of a Pointlist format.
The easiest ones are the two formats listed under FTS-5002 named the Boss Format list and the Poss Format list.
Both lists uses the Boss,[AKA] reference modell.
So there is an easy referencing between a Pointlist record and the related Nodelist record.
The difference in Boss Format and Poss Format list is, that the Poss Format uses the Nodelist Status Point for all Points listings where the Boss Format list allows status flags (Hold, Down, Pvt). So the Boss Format list is the most precise list in also signal a points status where the Poss Format only includes redundant informations. We know, we read a Pointlist, this is signaled by the Boss keyword lines. So all other lines are point listings. No requirement to signal and flag them as Point in each pointlist line listing.
For transfering Boss Format or Poss Format pointlists into the Nodelist Database only the reference changes from a Hub, Host reference to a Bossnode reference. All other definitions are similar to a Nodelist record listing.
At this step, we can rethink, if there is a requirement to use the Point Nodelist status as a required flag, requiering a transfer into the database or if we can retrieve this info from another info?
To make the story short:
Each AKA is splitted 4 dimensional: [Zone]:[Net]/[Node].[Point]
where a Node listing is still Point # 0.
All other Pointnumbers are considered to be Points.
So every time we want to retrieve the info wether a listing is a Node or a Point listing in the database, we can look into the Pointnumber field and if its Not Equal 0 then it is a Point listing.
So therefor we don't require the Point status defined in the Nodelist status field and we can use the Nodelist status field for the infos we can retrieve out of Boss Format lists - the real status of a Point listing.
This applies to other Pointlist Formats too, that includes a status field NE 'Point' for Point listings.
This format is also known as V7 pointlist format.
The Point Format list still has merged Nodelist entries together with Points listings. So the Point Format list can be read as a copy of a Nodelist with integrated Point records. The reference to a Bossnode is similar to a listing of a Node under a Hub or Net Host filled with Point records between each Node lines.
To signal Point lines as Point lines, the Point Format uses the Point flag as a status field parameter similar to the Poss Format listed under 2.2.1
Likewise the discussion about the Point flag in a Poss Format Point list, this status flag is required in a Point Format list, to differentiate a Node record from a Point record.
Incorporate Points from a Point Format list is easy if the focus is an individual point record. It can be easily transfered like the Boss Format or Poss Format record can be transfered.
But remember - the Point Format list also includes structural a copy of the Nodelist(!). So a Bossnode record in the Nodelist is also listed with a Nodelist line in the Point Format list.
Assuming, that a Nodelist record in a Point Format Pointlist hasn't been updated properly, we potentialy have 2 different listings for a Node in 2 lists.
The Nodelist line may differ between the Nodelist and the Pointlist listing.
This isn't a big problem, if the Sysop name in a Nodelist record doesn't much differ from the Sysop name in a node line of the Point Format list.
But considering, that a nodenumber has been re-assigned to another sysop in the Nodelist, but the Pointlist nether gets updated, now the Pointlists Node record clashes with the Nodelists Node record. And this is a problem.
To solve this problem, a strict analyse in retrieving Pointlists records is required.
Each Bossnode listing in any Pointlist requires a matching Nodelist record to keep the database clean from orphaned Bossnode and respective Points listings.
This requirement isn't limited to Point Format pointlist records.
There are some known exceptions, where Point Format pointlists have shortened Region and/or Host lines. This only requires to have an exception in analysing input, once readin Region and Host lines from Point/V7 format pointlists, that such lines ends after field 2 and doesn't follow the FTS Nodelist format lines. For the Bossnode listings, this doesn't change anything, as Bossnodes are still listed with complete lines, so they can be compared with the corrosponding Nodelist record. The only exception is to take care while build up the AKA from lines that have no comma as seperator after the Region or Host number but followed by a carriage return sequence.
Region,31 <- Carriage Return Zone: ? Host,399 <- Carriage Return Net: 399 ,666,Just_a_Node,Simcity,Joe_Grunt,9-012-345-678,9600,VFC Node: 666 AKA: 399/666 Point,1,Point_1,Simcity,Joe_Point,-Unpublished-,300, Node: 666 AKA: 399/666.1 Point,2,Point_2,Simcity,Joes_Brother,9-345-678-901,9600,V34,XX,U,Tbc Node: 666 AKA: 399/666.2
The Fakenet-Format list is a mixture of all previously listed formats.
Its a full featured Nodelist where Bossnode listings have been shifted to Host listings and Net Pointlistkeeper listings to Region listings.
So that is the reason, why this format is named a Fakenet Format list.
The Host- and Region- listings are fakes in relation to Nodelist listings of Bossnodes.
The reference to a Nodelist listing of a Bossnode is made in the systemname field, field #3 of a nodelist line within a Host line.
The definition is, that the systemname has to represent the 2-dimensional AKA of a Bossnode.
e.g. Bossnode 2:244/1120 is defined 244/1120 in the systemname field.
So a Fakenet Format Bossnode segment can be read in the following way:
So there is one requirement to identify Fakenet Format pointlists. That is identification by Filename mask.
Pointlist lines can be retrieved from Fakenet Format lists as they've
been retrieved from Boss Format lists, respective the Nodelist status field.
The reference into a real Fidonet Network or Region is given by an addtl. definition given in the Filemask definition and reference to a Fakenet Format pointlist.
e.g. Net 244, Region 24.
R23 Fakenet Format Exception
In the early days of R23 Fakenet Format pointlists development a 2nd method to reference Bossnodes to a Nodelist record has been established using the UBOSS:[2d-AKA] keyword in the Host's line as addtl. Userflag.
Extract the 2d-AKA reference from either Systemname field (default) or from the UBOSS:[2d-AKA] nodelist flags fields (R34 1992-2008).
The Fidouser Format list consists of lines with the following format of a record:
So best practice with some added value is to add the Bossnodes city to a Fidouser Format pointlist line information that maybe exact correct or nearly correct in the majority of all cases. Individual cases can be simply ignored.
This usage adds more valuable information into the transfered pointlist
into the database then all Fidouser Format pointlists get a default
e.g. City -> Fidouser
The Fidonet Nodelist is the weekly distributed phonebook of all
fidonet members. The compilation consists of the Zone 1 - Zone 6, nowadays
Zone 1 - Zone 4 segments.
Each segment can be distributed seperately, but hasn't in the past, so the complete Nodelist has been distributed week by week since years.
Pointlist distribution did happen in a different way over years. The first attempt has been made in the early 2000's to collect regional pointlist segments together into a combined Zone 2 pointlist segment that also did include a Zone 6 segment. From the other Zones until today there is no info available, that ever pointlist segment distributions did happen.
At least from the end 80's until the early 2000's Pointlist distribution evaluated several pointlist distribution formats. Each region had their own specific distribution format. As also different software used different pointlist formats, it was common, that several regions distributed more then one different pointlist formats. It also was common, that the Pointlist distribution format did change over the years. At the very end the Fido-History-Project archive collected many different Pointlist formats, multiple formats within a Regions and a masterlist - the Zone 2 Pointlist.
The Nodelist has a uniqueness, that the Pointlists never had. Nodelist records are unique in one week. Pointlists records aren't. The Fido-History-Project collected Pointlists from over 10 years of pointlist distributions, where the masterlist still did not exist. Also since Zone 2 Pointlist distribution started, not all Pointlists have been incorporated in the first few years. To have all pointlist segments incorporated into the Fido-History-Project Nodelist and Pointlist database archive, several pointlists have to be imported.
This may result in an overlapping of Pointlist records, e.g. if one pointlist record has been imported from a regional level Fakenet format pointlist and the same record in the same week has been imported from the masterlist.
We know from Nodelists, that sysops may have several different Nodelist records, e.g. for administrative AKAs and also on a multiple lines system. But all these records emerges over weeks. Multiple identical records from different lists aren't forseen, at least not from the initial database concept.
Therefor, the database construct requires an addtl. marker:
From which list does the listed record comes from?
Host 2:244/0 from Nodelist Hub 2:244/1200 from Nodelist Node 2:244/1120 from Nodelist Point 2:244/1120.1 from Region 24 Bossformat list Point 2:244/1120.1 from Region 24 Fakenet format list Point 2:244/1120.1 from Zone 2 Bossformat list
From the example above we focus on the AKA's:
From the Nodelist we collect AKA's 2:244/0.0, 2:244/1200.0, 2:244/1120.0
and from the Pointlist(s) we collected 2:244/1120.1, 2:244/1120.1 and 2:244/1120.1
so we collected one record and 2 dupe records from 3 different pointlists.
Dupe checking for Pointnumbers is one possible option to merge
several listings from several pointlists into one unique Fido-History-Project
Nodelist and Pointlist archive database record with references from several
sources. All what to do is a dupecheck of an existing Point aka for ones
week Nodelist/Pointlist listing.
In case dupes will be eliminated, how can the references be displayed?
From above example 3 listings of 2:244/1120.1 can be displayed in a table row by row. But if there is only one row, the question might be, how these references can be displayed?
Hover info similar to the Nodelist Archive overview, where exceptions are displayed in red, signals addtl. info if a user hovers the Nodelist record with the mousepointer.
hovering hovered text display ------------- ---------------------------- Host 2:244/0 from Nodelist Hub 2:244/1200 from Nodelist Node 2:244/1120 from Nodelist Point 2:244/1120.1 from R24PNT, POINTS24, Z2PNT
In the case of unified Point AKA listings, the dupe checking has to
consider, that data integrity requires to be checked, that all dupe records
lists the similar content - Sysopname is identical, City is in one local area.
The biggest problem is, if one of the records doesn't match, the question will araise, how to proceed? Eliminate the mismatching record? Mark the mismatching record as mismatched?
Before answering this question, we probably have to ask, how such mismatches can happen?
Well, a discrepancy between a Nodelist's Bossnode record and a corrosponding Pointlist Bossnode line in e.g. the Point Format pointlist may happen if the Nodelist update path is a different update path then the Pointlist segment update path. But the Bossnode reference into a Nodelist is still a detected requirement (see section 2.2.2 The Point- Format). A mismatch of one Point AKA from different pointlist sources can be assumed, that it will never happen, because support of different Pointlist format list is handled by the Pointlistkeepers. So if they'll distribute a specific pointlist format, another pointlist format list will be distributed by the same Pointlistkeeper to support different Fidonet software pointlist format styles.
Example: in the early Region 24 pointlist distribution, the Fakenet Format list distribution was the defined major distribution path. This moved over to the Bossformat distribution years later. Over the time, both formats have been distributed in parallel. The content of both files was identical, because the 2nd format has been converted from #1 distributed pointlist.
So before we complicating processing, we can assume, that the major Pointlists AKA listings from several different pointlist format lists of one distribution week for a specific area are identical. There may exist cases, where one Point AKA is listed in one pointlist, but not in another one. But this problem can be solved, by readin all different pointlist format lists to complete the archive with missed records.
There exist known cases of Point listings under different Bossnodes. So one may be listed with different Point AKA's. This doesn't differ to multiple Nodelist listings of a Bossnode who also has a Hub, a Host and/or RC listing in the Nodelist.
The only difference will be shown in the Nodelist search, where the search for a sysopname will probably give more results then a result list that follows a specific AKA search. But this phenomenon we still have with the Node's Nodelist search.
In the case of unified Pointlist AKA listings, which lists record will be give preference?
Example: Point 2:244/1120.1 listing in the Bossformat list shows
and the Fakenet format list shows
Which record we will give the preference?
The Bossformat listing? The Fakenet Format listing, as it was long time the major distribution version?
Ok, here we have to go deeper into the discovery of current database table structures.
The table NODELIST lists the different Nodelists by date, year, week
The table NL_NODE lists the different AKAs, the full sysopname and the lastname
The table NL_LINE lists the extracted different Nodelist lines ever listed with a Nodelist status, systemname, location, phone, baudrate, nodelist flags
The table NL_FRMT still has the different Lists types: Nodelist, Boss, Point, Fakenet pointlists internal value references
The table NL_ENTRY is an index table that links the Nodelist listed under NODELIST with the record in the table NL_LINE. Each record in NL_LINE has a reference to one record in table NL_NODE.
Above example with 2 Point AKA listings will result in two records in table NL_LINE. Each of these 2 records references to one record in table NL_NODE.
The NL_LINE listing isn't much a problem. But the NL_NODE listing requires further discovery:
The NL_NODE listing for Nodes consists of detailed reference links. If one moves from one Hub to another, a new NL_NODE listing will be created to reference the listing under a different Hub.
At import time, we still have the Bossnodes reference (Net/Node) for a Point, but what we didn't have is the Bossnodes reference under current Nodelist. There is still a requirement to check the referenced Bossnode in the corrosponding Nodelist. So here we have to collect not only, that the Bossnode listing in current Nodelist is valid, but also what the reference listing is. This means, under which Region a Bossnode is listed, under which Hub a Bossnode is listed. If we have collected these infos from the database, we can also write Point records with these infos into the database. Problem solved.
If there exist no updates in the Nodelists from one week to the next week, at least table NL_ENTRY will be filled with the reference links of existing Nodelist records with the new Nodelist.
The accuracy of Pointlist listings depends on the accuracy of Bossnode references to a corrosponding Nodelist record for the Bossnode.
If this is once checked, the related Pointlist Point AKA records can be assumed to be correct (minimum of mismatches).
Its more likely, that a complete Bossnode segment is orphaned then a mismatch of one Pointline will happen.
There may exist different Point sysopnames over the time under one Bossnode, but they'll evolve over the time. The Point#1 listing in 1999 maybe a different one then in 2009. By search for an AKA this we also see in Nodelists did happen. By search for a sysopname, the result list ignores listings with a different sysopname.
So there is a slight preference of unified Pointlist AKA listings. The only question to answer, how the different references gets displayed.
One of the requirements in importing pointlists is the Nodelist
referencing for accuracy of each Bossnodes segment. If a Bossnode is no
longer listed in the Nodelist, a Pointlist segment of that Bossnode
listed in a later pointlist becomes orphaned.
All Nodelists are currently still imported, so that in an import process for Pointlists, the referenced Nodelist still exist in the database.
But what happens with new distributions?
Often Pointlists will be published the day before or around the Nodelist will be created. For a region the region check will be made against the Nodelist regional segment of the upcoming Nodelist. But there are still cases known of Nodelist's late distributions, where the Nodelist arives the Pointlistkeepers after weeks. Importing the Pointlist doesn't make sense, until an accurate Nodelist has been published. So for the current Pointlist processing and if the check for a nodelist didn't find a matching Nodelist, has to be delayed until the corrosponding Nodelist arrives.
There is still a work queue processing of current Nodelists implemented in the database update process that have to be used for Pointlists import processing too. So its easy to trigger a Pointlist processing where still the corrosponding Nodelist isn't available. The simple solution is to set the Pointlist import to status delayed until the corrosponding Nodelist import has been processed. Once the Nodelist arrives and the Nodelist processing has been triggered, the delayed Pointlist import can be processed thereafter.
|1.||Each Bossnode listing in any Pointlist requires a matching Nodelist record to keep the database clean from orphaned Bossnode and respective Points listings.||
Check each pointlist revision against a corrosponding
Check each bossnode in a pointlist against a corrosponding Node record in corrosponding Nodelist revision
Each AKA is splitted 4 dimensional:
where a Node listing is still Point # 0.
All other Pointnumbers are considered to be Points.
|So therefor we don't require the Point status defined in the Nodelist status field and we can use the Nodelist status field for the infos we can retrieve out of Boss Format lists - the real status of a Point listing.|
|3.||Records in Pointlists maybe shortened||Pointlist line checks requires check for comma or line termination delimiters|
|4.||Fakenet Format pointlists identification follows ...||identification by Filename mask|
|5.||Fakenet 2d-AKA references||
from either Systemname field (default) -or-
from the UBOSS:[2d-AKA] nodelist flags fields
|6.||Fidouser Format data completion||add the Bossnodes city and probably other infos (Phone, Baud, Nodelistflags) from the Bossnodes Nodelist info to a Fidouser Format pointlist line|
|7.||Unified Point AKA listings per Nodelist week||Check if Point AKA exist otherwise add new record|
|8.||Pointlists processing requires queuing||If a corrosponding Nodelist has been processed the Pointlist can be imported, otherwise processing has to be delayed|
|9.||Referencing data source||Add a data source type reference to each Nodelist and Pointlist line (Nodelist, Points24, Z2PNT)|
Started with development of pointlist imports using rules from chapter 3 beginning with Z2PNT import of newly new created pointlists.
By setting relation from pointlist to a related nodelist, first idea was to use the latest available nodelist that can be easily identified by a max record id identification.
But what's with older pointlists?
eg. I've started fidonet with a pointnumber in net 2:247, got the first nodenumber in 2:247 and moved over to the new installed net 2:249. Back in 1993 the net changed to 2:244 and in my latest fidonet history I've moved to net 2:240. The latest nodelist entry will be in net 2:240 but my first pointlistings related to net 2:247.
Pointlist to nodelist relation lookup has to identify the related bossnode line in the corrosponding nodelist by year and daynumber. Thats easy for all nodelists starting #165 in year 1991.
But what happens with pointlists before #165/1991?
Remarks from the development:
// get record with max id_nl_node // gilt fuer die "aktuellsten" Eintraege // beim Nachtrag von "alten" Records mag das aber nicht stimmen, // wenn die relevante Nodelist noch nicht existiert ... // hierzu muessten saemtliche Node- und Z2PNT Listen neu eingelesen werden // um ein stimmiges Ergebnis zu erhalten bzgl. Region + Hub Zuordnung // ggf. gibt es noch eine andere Tabelle, die die id_nl_node Zuordnung zur relevanten // Nodelist aus Nr + Jahr liefert ??!? // nodelist: id_nodelist,id_nl_frmt,id_nl_type,year,day,date // nodelist->id_nodelist => // nl_entry->id_nodelist/id_nl_line => // nl_line->id_nl_line -> id_nl_node => // nl_node->id_nl_node // aaaaaaabbbbbbeer // in den Anfaengen gibt es keine ausreichenden Nodelisten zu Pointlisten: // existing and imported Nodelists demgegenueber stehen R24 Pointlists
|und erst ab|
|1990 180 gibt es fortlaufende Nodelisten|
|und ab 1991 165 fortlaufende Pointlisten|
|1991 165||1991 165|
|dies gilt fuer r24 pointlists|
|1991 165||1991 165|
|1991 ...||1991 ...|
|1991 361||1991 361|
Except for R24 pointlist 1989 342 a related nodelist can be found and identified.
For R24 pointlist 1989 342 a related nodelist may be nodelist 1989 216 or nodelist 1990 026.
This can be defined as a single exception. If a nodelist lookup in 1989 216 cannot be found, a second lookup into nodelist 1990 026 may become successful for a related bossnode entry in a corrosponding nodelist.
|© 2003-2022 by Ulrich Schroeter||0000|