first_page

Songhay Publications and the Concept of the Index

What you are reading now is a Songhay Publication, this Blog. A Songhay Publication centers around the Document [GitHub]. Several Document instances can be grouped under one or more Segment [GitHub] instances. Further up the hierarchy, a Segment can group other Segment instances. Here is a diagram of the Segment, the Document (and the Fragment) to confuse almost everyone including me:

Songhay Publications Class Diagram

Some of the confusion comes from legacy Entity Framework concerns, specifically the use of ICollection<T> and other association properties. When I simplify my life with small-scale use of static JSON files, we can remove our concern for defining associations. Moreover, my small-scale use of Markdown eliminates the need (almost entirely) for Fragment. After all of this removing and eliminating, at least I can understand why I have the core interfaces, ISegment [GitHub] and IDocument [GitHub], of Songhay Publications:

Songhay Publications Interfaces

the Web Publication Index

Compiling a collection of ISegment groups into a file renders an Publication Index. Meanwhile, in the real world, there is the confusion between a “table of contents” and an Index. As of this writing in the Songhay studio, we avoid this confusion by defining an Index here and looking forward to defining a Keyword Index (which is a real-world index).

All of this fantastic thinking about the Index comes from my work on rendering a Web Publication Index in HTML. I have been working on the visual design of two kinds of Index layouts:

  1. the Index
  2. the Blog Index

Behind the visual design is the information design (too many years of my life was spent here). Both of these visual layouts are driven by a static JSON file this getting closer and closer to shapes of the models in Songhay.Publications. Today, I assume Songhay Index data will eventually look like this:

[
  {
    "segmentName": "Section 1",
    "segments": [
      {
        "segmentName": "Section 1.1",
        "segments": [
          {
            "segmentName": "Section 1.1.1"
          },
          {
            "segmentName": "Section 1.1.2"
          },
          {
            "segmentName": "Section 1.1.3"
          }
        ]
      },
      {
        "segmentName": "Section 1.2",
        "segments": [
          {
            "segmentName": "Section 1.2.1"
          },
          {
            "segmentName": "Section 1.2.2"
          },
          {
            "segmentName": "Section 1.2.3"
          }
        ]
      },
      {
        "segmentName": "Section 1.3",
        "segments": [
          {
            "segmentName": "Section 1.3.1"
          },
          {
            "segmentName": "Section 1.3.2"
          },
          {
            "segmentName": "Section 1.3.3"
          }
        ]
      }
    ]
  },
  {
    "segmentName": "Section 2",
    "segments": [
      {
        "segmentName": "Section 2.1",
        "segments": [
          {
            "segmentName": "Section 2.1.1"
          },
          {
            "segmentName": "Section 2.1.2"
          },
          {
            "segmentName": "Section 2.1.3"
          }
        ]
      },
      {
        "segmentName": "Section 2.2",
        "segments": [
          {
            "segmentName": "Section 2.2.1"
          },
          {
            "segmentName": "Section 2.2.2"
          },
          {
            "segmentName": "Section 2.2.3"
          }
        ]
      }
    ]
  }
]

The JSON above is truncated for prose reading which means that Document data has been left out. For example, Section 2.2 would have documents shaped like this:

{
  "segmentName": "Section 2.2",
  "segments": [
    {
      "segmentName": "Section 2.2.1",
      "documents": [
        {
            "title": "Article 2.2.1.1",
            "fileName ": "..",
            "modificationDate ": "..",
            "path": "..",
            "sortOrdinal": "..",
            "tag": ".."
        },
        {
            "title": "Article 2.2.1.2",
            "fileName ": "..",
            "modificationDate ": "..",
            "path": "..",
            "sortOrdinal": "..",
            "tag": ".."
        },
        {
            "title": "Article 2.2.1.3",
            "fileName ": "..",
            "modificationDate ": "..",
            "path": "..",
            "sortOrdinal": "..",
            "tag": ".."
        }
      ]
    },
    {
      "segmentName": "Section 2.2.2",
      "documents": [
        {
            "title": "Article 2.2.2.1",
            "fileName ": "..",
            "modificationDate ": "..",
            "path": "..",
            "sortOrdinal": "..",
            "tag": ".."
        },
        {
            "title": "Article 2.2.2.2",
            "fileName ": "..",
            "modificationDate ": "..",
            "path": "..",
            "sortOrdinal": "..",
            "tag": ".."
        }
      ]
    },
    {
      "segmentName": "Section 2.2.3",
      "documents": [
        {
            "title": "Article 2.2.3.1",
            "fileName ": "..",
            "modificationDate ": "..",
            "path": "..",
            "sortOrdinal": "..",
            "tag": ".."
        },
        {
            "title": "Article 2.2.3.2",
            "fileName ": "..",
            "modificationDate ": "..",
            "path": "..",
            "sortOrdinal": "..",
            "tag": ".."
        },
        {
            "title": "Article 2.2.3.3",
            "fileName ": "..",
            "modificationDate ": "..",
            "path": "..",
            "sortOrdinal": "..",
            "tag": ".."
        }
      ]
    }
  ]
}

the Songhay Publications Index in Typescript

The JSON above can be used as a guide to define a single Index entry in Typescript:

interface IndexEntry extends Partial<Segment> {
  segments?: IndexEntry[];
  documents?: Partial<Document>[];
}

I made a little manual test to verify that this interface is working.

distinguishing the search index from the Publication Index

I have opened an issue in the Publications repo because I introduced the Index concept to Songhay Publications to refer to a search Index. My work on a lunr search index at the beginning of this year was and is very exciting but lost sight of the bigger picture: a search Index must be distinguished from the Publication Index being introduced here.

Why can’t this search index be related to the Publication Index? Let us take a look at the lunr search index in use as of this writing:

[
  {
    "extract": "...",
    "clientId": "..",
    "inceptDate": "...",
    "modificationDate": "...",
    "title": "..."
  },
  {
    "extract": "...",
    "clientId": "..",
    "inceptDate": "...",
    "modificationDate": "...",
    "title": "..."
  },
]

It is clear that what we are looking at is an array of Partial<Document> with this extract property tacked on. (The inceptDate property, by the way, maps to the createDate property on Document; there is an issue out there addressing this weirdness).

Songhay Publications has a method for producing an extract from Document. So it is quite clear that a search index is a very simplified form of our new Publications Index.

Again, in Typescript we can formally and precisely define the SearchIndexEntry:

interface SearchIndexEntry extends Partial<Document> {
  string extract;
}

By writing this, I see now that I can replace the LunrIndexEntry I have defined for this Blog Publication with the more generalized SearchIndexEntry.

https://github.com/BryanWilhite/