Draft Specification for Intellivision ROM Metadata Tag Format.
By Joseph Zbiciak
14-Apr-2019
v0.13

==============================================================================
 Motivation
------------------------------------------------------------------------------

The primary motivation behind this specification is to unify the existing
methods for specifying Intellivision cartridge meta-data by providing a set
of tags that can be appended to a cartridge ROM image and interpreted by
various emulators and other Intellivision game-related tools.

This specification may end up serving very diverse interests, ranging from
simply allowing existing game ROMs to be catalogued efficiently, to adding
useful debugging and documenting material to new titles. While I think it
is useful to limit the scope of this specification in its initial form, I
would like to ensure that the specification can easily adapt to future uses
without breaking applications that don't comprehend the extensions.

==============================================================================
 Prior Art / Starting Point
------------------------------------------------------------------------------

At present, the existing emulators use a flat ROM to hold the cartridge
image. This ROM is augmented with a separate configuration file which
contains memory map information, and occasionally some additional
information that is usually specific to the INTVPC emulator that originated
the format. This format is referred to within this document as the
"BIN+CFG" format.

The BIN+CFG format has some desirable properties, and several undesirable
properties.

Desirable properties:

    -- The files are fairly compact. Because the binary files only
       contain the cartridge's ROM image, they're not overly large.
       Their size could be further reduced using an "8 + 2" encoding
       for 10-bit ROM images, although this isn't seen as necessary.
       (Also, it doesn't work for 16-bit ROM images anyway.)

    -- Most ROM images are fairly easy to load. Because a large number
       of Intellivision ROM images fit into a standard memory map, it is
       easy to load the ROM image. Often, a ROM does not require a CFG
       file.

    -- It is the current de-facto standard. All (or nearly all)
       cartridge images that are floating around are available in this
       format. Also, the cartridges on the "Intellivision Lives!"
       CD-ROM happen to be in this format.

Undesirable properties:

    -- The CFG files are annoying to parse. Their format seems
       semi-arbitrary, and more importantly, it's poorly documented.

    -- Most of the functionality of the CFG file is specific to its
       progenitor, the INTVPC emulator. (For example, the CFG file
       supports directives for configuring input controls and the
       debugger window that are very specific to INTVPC.)

    -- The BIN+CFG file is not a self-contained format. The information
       is spread between two files.

Changes in BIN+CFG format since 0.04 that affect this specification:

    -- jzIntv / SDK-1600 provides a standardized implementation of the
       CFG parser.

    -- New assembler directives make it possible to add variables to the
       CFG file.

    -- jzIntv / SDK-1600 now recognize certain CFG variables as holding
       cartridge metadata.

    -- Extended to support Mattel style page-flipping, RAM, WOM, etc.

There are existing alternatives. When Chad Schell developed Intellicart,
he devised a different format for encoding Intellivision ROMs that
encapsulated much of the same information as the CFG file into a single
file with the ROM image itself. Also, since the Intellicart supports
interesting features such as bankswitching and writable memory, the image
contains a fairly expressive map of memory attributes in the image.
Because Chad's initial software output files of this format with the
extension ".ROM", this is referred to as the ".ROM" (pronounced "dot-ROM")
format.

As with the BIN+CFG format, the .ROM format offers several desirable and
undesirable properties:

Desirable properties:

    -- Fairly compact.
       The .ROM header adds 53 bytes to the size of a .BIN, and another
       4 bytes per ROM segment.

    -- Somewhat expressive. The .ROM format specifies the memory map for
       the cartridge, including the properties of each span in the
       memory map. These properties are largely orthogonal to each
       other, but there are of course some combinations that don't make
       tons of sense. Defined properties include:

           -- Readability
           -- Writeability
           -- Bankswitchability

       It is possible to define a span of memory that's writeable, but
       not readable, for instance. (Great for shadowing system memory in
       the cartridge, say if you're writing a debugger or something.)

    -- Checksummed. The header information and the cartridge data are
       both protected with a CRC-16 checksum that allows for a quick
       sanity check.

    -- Self-contained. All of the information is in a single file.

    -- Easily identified. The first few bytes of the .ROM file have an
       easily recognized structure that makes positive identification of
       a .ROM file very easy.

    -- Extendable. It turns out, in its present implementation, the
       Intellicart and CC3 ignore anything that's transmitted after the
       currently defined file boundaries. Because the header specifies
       the size of the ROM data, the Intellicart knows when to stop
       listening. That means we can add information beyond the end of
       file.

Undesirable properties:

    -- Still lacks a means for specifying additional metadata.

    -- Slightly more complex to parse. (Although, honestly, if an SX-52
       microcontroller can parse it easily, so can you.)

    -- The memory map granularity is limited to 2K-word boundaries (with
       some additional flexibility that offers 256-word granularity in
       special cases). No existing cartridge requires finer granularity,
       so this is not yet perceived as a problem.

    -- This format also does not comprehend Mattel-style bankswitching,
       though it does offer its own flavor of bankswitching support.

    -- A slight variation exists between Intellicart and CC3.
       Specifically, the "autobaud" byte differs between the two
       (0xA8 vs. 0x41).

The remainder of this document assumes that the tags we're defining are
appended to such a file. The actual existing .ROM file format is described
elsewhere. This document could be considered an Annex to that file format.

==============================================================================
 Change History
------------------------------------------------------------------------------

Versions 0.01 .. 0.03: Lost to the sands of time

Version 0.04:
    -- Released Feb 13th, 2001
    -- Implemented in at least one version of Bliss by Kyle Davis
    -- Incomplete implementation in jzIntv

Version 0.05:
    -- Released Aug 26th, 2016
    -- Add Short Title and License tags (tag 0x08, 0x09)
    -- Update rationale to reflect evolution of BIN+CFG format.
    -- Removed the EGGS, MULT and MPLAYER flags in the Game Attribute
       flags
    -- Recommend de-spec'ing controller binding tag (tag 0x07), as that
       isn't descriptive, but rather prescriptive.
    -- Add flags for JLP, LTO Mapper.

Version 0.06:
    -- Released Sep 3rd, 2016
    -- Removed 4CTRL flag, leaving that field reserved

Version 0.07:
    -- Released Jan 5th, 2017
    -- Revised definition of 'intv2' config variable, to make it more
       like 'ecs' and 'voice' flags, and distinct from 'intv2_compat'
       flag.
    -- Fixed a couple typos

Version 0.08:
    -- Released Apr 5th, 2017
    -- Add flag to indicate TutorVision compatibility/support

Version 0.09:
    -- Changing Info URLs to just be a list of URLs w/out description,
       as the CFG format doesn't really have a way of assigning a pair
       of strings to a config variable, and I don't really have any
       other plumbing that would handle pairs of strings.

Version 0.10:
    -- Rename tutorvision_compat to tv_compat to match everything else.
       It turns out I had an inconsistent usage of the two across
       jzIntv, and that's now fixed to use the shorter name.

Version 0.11:
    -- Add build_date, version tags.

Version 0.12:
    -- Add UTF-8 support, including byte-stuffing for some edge cases.
    -- Typo fixes.

Version 0.13:
    -- Add BSD2 string to recommended license strings, and add a
       suggestion to include a version number when selecting a Creative
       Commons license.

==============================================================================
 Design Criteria
------------------------------------------------------------------------------

In order to guide the design of the cartridge metadata tags, I'd like to
apply a set of criteria. This list may morph over time as additional
considerations are applied.

    1.  It must NOT disturb the normal functioning of the actual
        Intellivision ROM itself. In speaking with Chad Schell about
        the Intellicart ROM format, this can be accomplished by merely
        appending our tag data at the end of the file.

    2.  All aspects of the metadata tags must be fully optional. That
        is, an emulator must be able to read and understand the .ROM
        file without interpreting the additional metadata. Also, the
        emulator must be able to pick-and-choose which elements of the
        metadata it does wish to read.

    3.  The specification must be readily extendable. The extensions
        must adhere to a general framework that allows other emulators
        to safely ignore the extensions (as stated in #2 above).

    4.  The tags should be relatively easy to parse. We're not here to
        make lives difficult. :-)

    5.  Fairly common metadata tags should be standardized up front to
        make sure we have the greatest amount of common functionality
        across all of the Intellivision emulators.

    6.  The encoding should be fairly compact. We don't want bloatware.

    7.  There should be some way of checking tag integrity. This will
        help identify corrupted .ROMs.

    8.  We should use reasonable limits on tag structure to minimize the
        burden on the decoding code in the emulators. The exact
        definition of "reasonable limits" is still somewhat vague. This
        bullet is a more specific statement of the general theme of
        criterion #4.

    9.  It should be fairly easy to add/remove tags at any given time.

    10. It should be fairly easy to find tags in a .ROM file.

==============================================================================
 Desired Tags, and Related Discussion
------------------------------------------------------------------------------

Here are some proposed pieces of metadata that might be useful to store
with a cartridge. These are roughly sorted by my own priorities for
inclusion.

High:

    -- Company/Publisher (eg. Mattel, Imagic, Atarisoft, Activision, CBS,
       etc.)
    -- Authors (if known. Eg. John Park Sohl wrote Astrosmash)
    -- Title
    -- Year
    -- Comments
    -- Flags (Supports/Requires/Incompatible With/Don't Care) for
        -- ECS
        -- 4 Controllers (w/ECS) (one Soccer Game works this way with ECS)
        -- Intellivoice
        -- Keyboard Component

       (For example, PacMan would be listed as "Incompatible With" ECS,
       as a real Inty crashes in this combination. Astrosmash might list
       "Supports ECS" because the "Sucky" feature works with Astrosmash,
       but "Don't Care" under "4 Controllers". "Jetson's Way With Words"
       would list "Requires ECS" and "Incompatible With 4 Controllers"
       (it *requires* the keyboard to be attached to the ECS).)

Medium:

    -- URL to webpage with more info (eg. link to BSR's site, or author's
       page if 3rd party game)
    -- Controller definitions (human readable)
    -- Flags:
        -- 1 player / 2 player / 1 or 2 player

Low:

    -- Known Easter Egg "reset values"
        -- # of known Easter Eggs for this cart.
        -- Values to place on both controllers (for, say, first full
           second) to trigger Easter Egg.
        -- Descriptions of each Easter egg.

       (For example, I know of several Easter eggs in old APh-written
       games. Not sure how to handle Keyboard Component Easter Eggs.
       There are some which require holding a pattern of keys on the
       keyboard!)

    -- Documentation? :-)
    -- The kitchen sink

Kyle Davis has noted that some of these proposed tags might contain fairly
subjective data, and so may not be appropriate for inclusion in the format.
For example, "Comments" may tend to be a dumping ground for whatever the
particular individual setting the tag happens to want to put in that field.
I agree somewhat, but I'm not convinced I should remove the tag. I'm
interested in discussion on this topic.

Kyle also notes that some of the other tags might contain ephemeral
information, and so may quickly get out of date. For instance, tags with
URLs may get out of date as websites move, etc. This one is another
difficult one to solve. Still, I'd like some way for 3rd party cart
writers to put their contact info on a cartridge, even if the info is
ephemeral.

I wonder if this could be solved by assigning each cartridge a "globally
unique identifier" of some sort, and then separately building a public
database of known cartridges that this transient data could be kept in. We
already sorta have this kind of identifier with the existing lists of
BIN-file CRCs that are floating around. Perhaps we can use that as a
starting point?

==============================================================================
 Mechanics: The File Format Itself
------------------------------------------------------------------------------

This is my first stab at defining the actual file format. Please take it
with a grain of salt.

Identification:

    Because the metadata tags will be placed after the end of the
    cartridge ROM data in the .ROM file, it would be useful to have a
    means to quickly identify whether a .ROM file has metadata tags
    attached, and where those tags start. However, since the existing
    .ROM file can end with an arbitrary sequence of bits, this isn't
    easily done. So, instead, it's necessary to parse through the .ROM
    file head-first to find the data. Fortunately, that's fairly simple.

    The following C code defines an algorithm for scanning through the
    .ROM file to find the starting point where the tag information would
    be. (I use nearly identical code in jzIntv at present.)

    #include <stdio.h>

    /* ================================================================ */
    /*  FIND_TAGS -- Finds file offset of metadata tags in an extended  */
    /*  .ROM file.                                                      */
    /* ================================================================ */
    long find_tags(FILE *rom_image)
    {
        int  num_segments, i, seg_lo, seg_hi;
        long end_of_file, tag_offset;

        /* ------------------------------------------------------------ */
        /*  First, find the filesize by seeking to end-of-file.         */
        /* ------------------------------------------------------------ */
        fseek(rom_image, 0, SEEK_END);
        end_of_file = ftell(rom_image);
        rewind(rom_image);

        /* ------------------------------------------------------------ */
        /*  Now start parsing the .ROM file.                            */
        /*  Only the 0xA8 autobaud byte is accepted here; the 0x41      */
        /*  variant noted earlier would need its own check.             */
        /* ------------------------------------------------------------ */
        if (fgetc(rom_image) != 0xA8)
            return -1;  /* not a .ROM file. */

        /* ------------------------------------------------------------ */
        /*  Get the number of ROM segments and sanity-check it.         */
        /* ------------------------------------------------------------ */
        num_segments = fgetc(rom_image);

        if (num_segments < 1)
            return -1;  /* Invalid # of ROM segments. */

        if (num_segments != (0xFF ^ fgetc(rom_image)))
            return -1;  /* Header consistency check failed. */

        /* ------------------------------------------------------------ */
        /*  Skip over the ROM segments.                                 */
        /* ------------------------------------------------------------ */
        for (i = 0; i < num_segments; i++)
        {
            /* -------------------------------------------------------- */
            /*  Read the memory range of this ROM segment and apply a   */
            /*  simple sanity check to it.                              */
            /* -------------------------------------------------------- */
            seg_lo = fgetc(rom_image);
            seg_hi = fgetc(rom_image) + 1;

            if (seg_lo >= seg_hi)
                return -1;  /* Address range is backwards! */

            /* -------------------------------------------------------- */
            /*  The segment range is in terms of 256-word pages.  Each  */
            /*  256-word page is 512 bytes.  Skip over the segment and  */
            /*  CRC-16.                                                 */
            /* -------------------------------------------------------- */
            if (fseek(rom_image, (seg_hi - seg_lo) * 512 + 2, SEEK_CUR) < 0 ||
                ftell(rom_image) > end_of_file - 50)
                return -1;  /* Bad ROM segment or truncated file. */
        }

        /* ------------------------------------------------------------ */
        /*  Now skip over the enable tables.  These have a well-defined */
        /*  and fixed size.                                             */
        /* ------------------------------------------------------------ */
        fseek(rom_image, 50, SEEK_CUR);

        /* ------------------------------------------------------------ */
        /*  Let's see where we ended up in the file.                    */
        /* ------------------------------------------------------------ */
        tag_offset = ftell(rom_image);

        /* ------------------------------------------------------------ */
        /*  Return 0 if there are no tags, or the file offset if there  */
        /*  are any tags to be found.                                   */
        /* ------------------------------------------------------------ */
        return tag_offset < end_of_file ? tag_offset : 0;
    }

Parsing:

    The Metadata tags are structured as a series of separate tags. The
    tags themselves effectively form a linked list. The order of the
    list is unimportant.

    Each tag has a short variable-length header and a 2 byte footer. The
    first one to four bytes of the header are the tag length (not
    including header and footer). The remaining header byte is the tag
    type #. (Type numbers are defined below.) The CRC footer is placed
    at the end to make it easier to process the file in a sequential
    fashion. This also matches the flow that the original .ROM file has.

    General tag structure:

        +----------+
        |  length  |   Byte 0 through N-1
        +----------+
        |  type #  |   Byte N
        +----------+
        |   ....   |   Byte N + 1
        |   ....   |
        |   body   |
        |   ....   |
        |   ....   |   Byte "N + length"
        +----------+
        | CRC16 hi |   Byte "N + length + 1"
        +----------+
        | CRC16 lo |   Byte "N + length + 2"
        +----------+

    The CRC16 covers every byte from the first byte of the 'length' to
    the last byte of the body, so every byte of the tag is protected.
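
    As a rough sketch of how a reader might check the footer, the
    fragment below recomputes a CRC-16 over the covered bytes and
    compares it against the two footer bytes (high byte first, per the
    diagram above). The CRC parameters are an ASSUMPTION: this document
    does not state them, so the sketch uses polynomial 0x1021 with an
    initial value of 0xFFFF, MSB-first, no final XOR. Confirm them
    against the .ROM format description before relying on this. The
    names crc16_update and tag_crc_ok are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED parameters: polynomial 0x1021, initial value 0xFFFF,      */
/* MSB-first, no final XOR.  Not stated anywhere in this document.   */
static uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    int bit;

    crc ^= (uint16_t)(byte << 8);           /* fold byte into top bits */
    for (bit = 0; bit < 8; bit++)
        crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                             : (uint16_t)(crc << 1);
    return crc;
}

/* 'tag' points at the first length byte.  'covered' is the number   */
/* of bytes the CRC protects: length field + type byte + body.  The  */
/* two footer bytes must follow immediately after.                   */
static int tag_crc_ok(const uint8_t *tag, size_t covered)
{
    uint16_t crc = 0xFFFF;
    size_t   i;

    assert(tag != NULL);
    for (i = 0; i < covered; i++)
        crc = crc16_update(crc, tag[i]);

    return tag[covered]     == (uint8_t)(crc >> 8) &&   /* CRC16 hi */
           tag[covered + 1] == (uint8_t)(crc & 0xFF);   /* CRC16 lo */
}
```

    A caller would first decode the length field to learn how many bytes
    are covered, then pass that count (length bytes + 1 + body length).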

    The length is encoded as a variable-length field to make the coding
    more dense. This reduces overhead for short-length packets, while
    still allowing for longer packets. The following scheme encodes the
    packet length:

           7     6     5     4     3     2     1     0
        +-----+-----+-----+-----+-----+-----+-----+-----+
        | Num Bytes |      6 LSBs of Packet Length      |   Byte 0
        +-----+-----+-----+-----+-----+-----+-----+-----+
        |      Bits 13 through 6 of Packet Length       |   Byte 1
        +-----+-----+-----+-----+-----+-----+-----+-----+
        |     Bits 21 through 14 of Packet Length       |   Byte 2
        +-----+-----+-----+-----+-----+-----+-----+-----+
        |     Bits 29 through 22 of Packet Length       |   Byte 3
        +-----+-----+-----+-----+-----+-----+-----+-----+

    The "Num Bytes" field contains a two-bit code which says how many
    bytes are present in the length field. "00" means only one byte,
    "01" means two, "10" means three, and "11" means four. This permits
    sizes ranging from 0 to approx 1 billion -- definitely suitable for
    our needs.

    The following C fragment can decode the length easily:

        int nb, ofs, len, i;
        unsigned char *img;

        /* ------------------------------------------------------- */
        /*  "img" points to .ROM image in memory.                  */
        /*  "ofs" is our current offset in that image.             */
        /* ------------------------------------------------------- */

        nb  = img[ofs] >> 6;        /* Find # of bytes in length    */
        len = img[ofs++] & 0x3F;    /* Get 6 LSBs of length.        */

        for (i = 0; i < nb; i++)    /* Process additional bytes.    */
            len |= img[ofs++] << (6 + i*8);

    All content-carrying tags have the general structure shown above.

    As stated above, the tags form a linked list. The list is terminated
    with a "NUL tag". This tag is one byte long, and consists solely of
    the length byte set to 0. There is no footer on the NUL tag.

Predefined Tag Types:

    The .ROM metadata format defines a set of standardized tag ID numbers
    to reduce file storage requirements and to encourage commonality
    across the various emulators. I've broken down the tags into several
    ranges of values according to their general purpose.
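
    To tie the Parsing rules together, here is a minimal sketch of a
    walker that steps from tag to tag until it reaches the NUL tag. It
    is a bounds-checked variant of the length-decoding fragment above
    and deliberately does not verify the CRC footer. The function names
    decode_len and count_tags, and the choice to return -1 on a
    malformed list, are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Decode the variable-length field at img[*ofs] and advance *ofs    */
/* past it.  Returns the body length, or -1 if the field would run   */
/* past the end of the image.                                        */
static long decode_len(const unsigned char *img, size_t size, size_t *ofs)
{
    int  nb, i;
    long len;

    if (*ofs >= size)
        return -1;

    nb = img[*ofs] >> 6;                 /* # of extra length bytes   */
    if (size - *ofs < (size_t)nb + 1)
        return -1;

    len = img[(*ofs)++] & 0x3F;          /* 6 LSBs of the length      */
    for (i = 0; i < nb; i++)
        len |= (long)img[(*ofs)++] << (6 + i * 8);

    return len;
}

/* Count the content-carrying tags starting at 'ofs'.  Returns -1 if */
/* a tag is truncated or the list ends without a NUL tag.            */
static int count_tags(const unsigned char *img, size_t size, size_t ofs)
{
    int count = 0;

    assert(img != NULL);
    while (ofs < size && img[ofs] != 0x00)
    {
        long len = decode_len(img, size, &ofs);

        if (len < 0)
            return -1;

        /* Type byte + body + 2-byte CRC footer must fit.            */
        if (size - ofs < (size_t)len + 3)
            return -1;

        ofs += (size_t)len + 3;
        count++;
    }

    return ofs < size ? count : -1;      /* -1: no NUL tag was found  */
}
```

    A real reader would dispatch on the type byte (the byte right after
    the length field) instead of just counting, and would skip any type
    it does not recognize.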

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 0x00 - 0x1F: General Game Information
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    0x00    "Ignore" Tag.

            Tags of this type should be ignored ALWAYS. This type may be
            of use to .ROM file editors for deleting a tag in-place, but
            is otherwise here as a placeholder.

    0x01    Cartridge Title.

            Format: UTF-8 string. No terminating NUL, as length is
            specified by the header.

    0x02    Cartridge Publisher.

            The body should start with a single byte to specify the
            publisher from the set below:

                0x00 -- Mattel Electronics
                0x01 -- INTV Corporation
                0x02 -- Imagic
                0x03 -- Activision
                0x04 -- Atarisoft
                0x05 -- Coleco
                0x06 -- CBS
                0x07 -- Parker Brothers
                0x08 -- Sears
                0x09 -- Sega
                0x0A -- Nintendo
                0x0B -- Interphase
                0x0C -- Digiplay
                0x0D -- Dextell
                0x0E -- Intellivision, Inc.
                (others?)
                0xFF -- Other

            If "Other" is set, a UTF-8 string for the publisher should be
            provided immediately after. Again, no terminating NUL, as
            the length is specified by the header.

    0x03    Cartridge Development Credits

            A list of varying-length records that contain the developers'
            names and what they're responsible for.

            The first byte of the record contains a set of flags which
            denote what each individual was responsible for:

                0x01 -- Programming
                0x02 -- Game artwork (graphics)
                0x04 -- Music
                0x08 -- Sound effects
                0x10 -- Voice samples (voice acting)
                0x20 -- Documentation
                0x40 -- Game concept/design
                0x80 -- Box / other artwork

            After the flag byte is either a single "code" byte in the
            range 0x80 to 0xFF that specifies a name from a predefined
            table of names, or a null-terminated UTF-8 string.

            Names in the coded name table would conceivably include the
            BSRs, the APh folks, the Activision folks, etc... (I've
            compiled a separate list of 127 names that I scoured from
            online resources.)

            To support UTF-8: If the first byte of the string is >= 0x80,
            or if it is 0x01, then this character is escaped by stuffing
            an 0x01 immediately before it. This 0x01 byte needs to be
            unstuffed on decode.

    0x04    Related URLs / Author Contact Info.

            Hmm... Should I have this or not? One possible idea:

            UTF-8 string with a URL. No terminating NUL, as length is
            specified by the header.

            URLs could be mailto's, http links, or whatever. Most useful
            for "indie" games such as 4-Tris.

    0x05    Cartridge Release Date

            This entry is 1 to 8 bytes long. Shorter records provide
            less precision on the release date.

                Byte 0: Years since 1900. 0 - 255 corresponds to
                        1900 - 2155.
                Byte 1: Month (1 - 12).
                Byte 2: Day (1 - 31).
                Byte 3: Hour (0 - 23).
                Byte 4: Minute (0 - 59).
                Byte 5: Second (0 - 60). Allows for 1 leap second.
                Byte 6: Offset from UTC in hours (-12 - +12).
                Byte 7: Offset from UTC in minutes (0 - 59)

            For the UTC offset, +ve values mean "East of UTC", while -ve
            values mean "West of UTC". To express -ve values on the
            half-hour, such as -0130, encode as -2 hours, and +30
            minutes. The minutes offset is always +ve.

    0x06    Game Attribute / Compatibility Flags

            This tag is at least 3 bytes long and up to 5. The first 3
            bytes are divided into fields as follows:

                  7   6   5   4   3   2   1   0
                +---+---+---+---+---+---+---+---+
                |  ECS  | rsvd  | VOICE | KEYBD |   byte 0
                +---+---+---+---+---+---+---+---+

                +---+---+---+---+---+---+---+---+
                | rsvd  | rsvd  | TUTOR | INTY2 |   byte 1
                +---+---+---+---+---+---+---+---+

                +---+---+---+---+---+---+---+---+
                |           reserved            |   byte 2
                +---+---+---+---+---+---+---+---+

            The defined fields correspond to the following peripherals:

                ECS   -- The Entertainment Computer System.
                VOICE -- The Intellivoice Speech Synthesizer
                KEYBD -- The Keyboard Component (not ECS).
                INTY2 -- The Intellivision 2
                TUTOR -- The TutorVision

            Each field holds a two-bit code which specifies whether the
            cartridge is compatible with or requires a particular
            peripheral. The four codes are:

                00 -- Don't Care.
                      The cartridge works with or without the
                      peripheral. The peripheral is ignored.

                01 -- Supports.
                      The cartridge works with the peripheral and may
                      provide extra functionality when used with this
                      peripheral. (eg. original Mattel carts work with
                      the "SUCKY" feature of ECS BASIC.)

                10 -- Requires.
                      The cartridge will not work without this
                      particular peripheral.

                11 -- Incompatible.
                      The peripheral must not be used with this
                      cartridge, because the two are not compatible.
                      (For example, Pac-Man is incompatible with the
                      ECS.)

            Bytes 3 and 4, if present, specify support for JLP
            Accelerator, JLP Flash, and LTO Mapper features. Note: Games
            that *require* these features (rather than testing for them
            and adapting at run time) will not run correctly on an
            Intellicart or CC3.

               7     6     5      4     3     2     1     0
            +-----+-----+------+-----+-----+-----+-----+-----+
            | JLP Accel | LTOM |    reserved     | JLPF[9:8] |   byte 3
            +-----+-----+------+-----+-----+-----+-----+-----+

            +------+-----+-----+-----+-----+-----+-----+-----+
            |                 JLP Flash[7:0]                 |   byte 4
            +------+-----+-----+-----+-----+-----+-----+-----+

            Byte 4 need not be present if JLP Accel is set to 00 or 01.

            The fields are defined as follows:

                LTOM        When set, enables the LTO Mapper
                JLP Accel   Enables JLP accelerators as follows. (Table
                            below)
                JLP Flash   Specifies minimum number of JLP Flash sectors
                            req'd

            The JLP Accel field works in concert with the JLP Flash field
            to define JLP Flash support:

                00  JLP accel disabled; flash disabled (i.e. no JLP
                    features)
                01  JLP accel enabled, default to "on"; flash disabled
                10  JLP accel enabled, default to "off"; flash enabled
                11  JLP accel enabled, default to "on"; flash enabled

            The JLP Flash field must be non-zero for JLP Accel modes 10
            and 11. The JLP Flash size is in units of 1.5K sectors. It
            specifies the minimum required for the cartridge; however,
            the unit may provide more.

            Additional flag bytes may be specified in the future. Those
            additional bytes should be ignored by emulators that don't
            expect them.
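
            As an illustration of the bit layout above, here is a
            minimal sketch that unpacks the body of a tag 0x06 into
            separate fields. The struct game_flags container, the enum
            names, and the decode_flags function are hypothetical and
            not part of this specification. The sketch treats bytes 3
            and 4 as optional and ignores any bytes beyond byte 4, as
            described above.

```c
#include <assert.h>
#include <stddef.h>

/* Two-bit compatibility codes, as listed for tag 0x06.              */
enum compat { DONT_CARE = 0, SUPPORTS = 1, REQUIRES = 2, INCOMPATIBLE = 3 };

/* Hypothetical container for the decoded fields (not in the spec).  */
struct game_flags
{
    enum compat ecs, voice, keybd, inty2, tutor;
    int jlp_accel;      /* raw two-bit JLP Accel code (0 .. 3)        */
    int lto_mapper;     /* LTOM bit (0 or 1)                          */
    int jlp_flash;      /* 10-bit minimum JLP Flash sector count      */
};

/* Extract the two-bit field whose low bit sits at position 'shift'. */
static enum compat field2(unsigned char byte, int shift)
{
    return (enum compat)((byte >> shift) & 0x03);
}

/* Decode a tag 0x06 body of 'len' bytes.  Returns 0 on success, or  */
/* -1 if the body is shorter than the 3-byte minimum.                */
static int decode_flags(const unsigned char *body, size_t len,
                        struct game_flags *out)
{
    assert(body != NULL && out != NULL);
    if (len < 3)
        return -1;

    out->ecs   = field2(body[0], 6);     /* byte 0, bits 7..6         */
    out->voice = field2(body[0], 2);     /* byte 0, bits 3..2         */
    out->keybd = field2(body[0], 0);     /* byte 0, bits 1..0         */
    out->tutor = field2(body[1], 2);     /* byte 1, bits 3..2         */
    out->inty2 = field2(body[1], 0);     /* byte 1, bits 1..0         */

    out->jlp_accel  = 0;                 /* defaults when bytes 3 and */
    out->lto_mapper = 0;                 /* 4 are absent              */
    out->jlp_flash  = 0;

    if (len >= 4)
    {
        out->jlp_accel  = (body[3] >> 6) & 0x03;
        out->lto_mapper = (body[3] >> 5) & 0x01;
        out->jlp_flash  = (body[3] & 0x03) << 8;    /* JLPF[9:8]      */
    }
    if (len >= 5)
        out->jlp_flash |= body[4];                  /* JLP Flash[7:0] */

    return 0;
}
```

            For example, a body of 0x87 0x01 0x00 0xE1 0x04 decodes as
            "Requires ECS", "Supports Intellivoice", "Incompatible with
            the Keyboard Component", "Supports Intellivision 2", LTO
            Mapper enabled, JLP Accel code 11, and a minimum of 260 JLP
            Flash sectors.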

    0x07    Controller Bindings: RECOMMEND DE-SPEC'ING

            This tag contains human-readable descriptions of the actions
            bound to each key.

            Altogether, there are 16 different inputs defined on each
            controller: The 12 keypad keys, the three action buttons,
            and the direction disc. Given four possible controllers,
            there are 64 possible controller inputs. Additionally,
            functions may be bound to the 48 keys on the ECS keyboard.

            The various input sources are assigned to the following
            ranges of coded values:

                0x80 - 0x8F -- Controller 0 inputs
                0x90 - 0x9F -- Controller 1 inputs
                0xA0 - 0xAF -- Controller 2 inputs
                0xB0 - 0xBF -- Controller 3 inputs
                0xC0 - 0xCF -- Controller 0, 1, 2, or 3 inputs
                0xD0 - 0xFF -- Keyboard keys.

            The "0xC?" range is intended to mean "The game doesn't
            distinguish between controllers".

            For the controllers, the 16 inputs are defined like so:

                0x?0 - 0x?9 -- Keypad digits 0 through 9
                0x?A        -- Clear
                0x?B        -- Enter
                0x?C        -- Top Action Buttons
                0x?D        -- Lower Left Action Button
                0x?E        -- Lower Right Action Button
                0x?F        -- Direction Disc

            For the ECS keyboard, I need to construct a table. For now,
            I plan to just read off the keyboard layout left to right,
            and assign codes linearly. Perhaps a different mapping makes
            more sense?

            Because multiple keys may be bound to the same action, the
            records are structured as lists of coded input sources
            followed by an ASCII string. Because the coded input sources
            are all >= 0x80, no separators are needed.

            For example, suppose the DISC moves the player, and the three
            Action Buttons fire. Also, suppose the game makes no
            distinction between the controllers.
            This might be coded as follows: (line-wrapped for clarity
            only)

            +------+-----+-----+-----+-----+------+------+------+
            | 0xCF | 'M' | 'O' | 'V' | 'E' | 0xCC | 0xCD | 0xCE |
            +------+-----+-----+-----+-----+------+------+------+
            +-----+-----+-----+-----+
            | 'F' | 'I' | 'R' | 'E' |
            +-----+-----+-----+-----+

    0x08    Cartridge Abbreviated Title

            The abbreviated title is meant for display on a single row of
            the Intellivision display. It is a UTF-8 string with no NUL
            terminator, as length is specified by the header.

            No restriction is placed on length; however, this
            specification recommends keeping it to 18 characters for
            maximum compatibility.

    0x09    Cartridge License

            Format: UTF-8, no NUL terminator, as length is specified by
            the header.

            There is no specific format for this field; however, this
            specification recommends using the following strings for
            common licenses:

            String        License
            ------------  -------------------------------------------------------
            GPLv2         GNU General Public License, Version 2
            GPLv2+        GNU General Public License, Version 2 or later
            GPLv3         GNU General Public License, Version 3
            GPLv3+        GNU General Public License, Version 3 or later
            BSD2          BSD 2 Clause
            BSD3          BSD 3 Clause
            CC CC0        Creative Commons, no restrictions
            CC BY         Creative Commons, attribution alone
            CC BY-SA      Creative Commons, attribution, share alike
            CC BY-NC      Creative Commons, attribution, non-commercial use
            CC BY-ND      Creative Commons, attribution, no derivatives
            CC BY-NC-SA   Creative Commons, attribution, non-comm., share alike
            CC BY-ND-SA   Creative Commons, attribution, no deriv., share alike

            Note that "Public Domain" isn't really a license. Consider
            "CC CC0" instead.

            For Creative Commons licenses, consider appending a version
            number if you wish to refer to a specific version--e.g.
            CC BY-ND-SA 4.0.

            If the program has a proprietary license, you might include a
            pointer to the license, or a short description. For example:
            "Copyright 2018 John Q. Hacker; Licensed for individual use to Robert X.
            Gamer"

    0x0A    Cartridge Description

            Format: UTF-8, no NUL terminator, as length is specified in
            the header.

    0x0B    Build Date

            Formatted identically to Release Date (0x05), but meant to
            capture the date a program was compiled, as opposed to
            released. Meant to track new games in development.

    0x0C    Version String

            This is a free-form string meant to allow authors to identify
            a particular build of the program in whatever way makes sense
            to the project. As such, this is a free-form UTF-8 string.

            Format: UTF-8, no NUL terminator, as length is specified in
            the header.

    0x0D - 0x1F     RESERVED

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 0x20 - 0x3F: Debugging / Development Related Information
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    0x20    Symbol Table

            This tag contains at least a fragment of the symbol table for
            the ROM image. This information could be used by a debugger
            to show the names of locations that the programmer used when
            writing his program. The assembler or other tool should
            append this record.

            The symbol table is fairly straightforward, consisting of
            records in the following format:

                +------------------+
                |  Addr (Hi Half)  |  Byte 0
                +------------------+
                |  Addr (Lo Half)  |  Byte 1
                +------------------+
                |       ....       |  Byte 2
                |  ASCIIZ String   |   ...
                |       ....       |  Byte N - 1
                +------------------+

            No additional header or other information is provided.

    0x21    Fine-Grain Memory Attribute Table

            Because the Intellivision makes no real distinction between
            what's code and what's data, it's useful to have a means of
            denoting what's what within the ROM. Again, this is
            primarily of use to debuggers and disassemblers, so that they
            correctly display code and data.

            Memory attributes are recorded with word granularity in this
            section. Because many contiguous words may have the same
            attribute, these attributes are encoded as spans.
            Each span is four bytes long, and has the following format:

                +------------------+
                |  Addr (Hi Half)  |  Byte 0
                +------------------+
                |  Addr (Lo Half)  |  Byte 1
                +---------+--------+
                |  FLAGS  | Length |  Byte 2
                +---------+--------+
                |      Length      |  Byte 3
                +------------------+

            The length is stored as a 12-bit quantity, with the upper
            four bits kept in the lower four bits of Byte 2. The actual
            attribute flags are stored in the upper four bits of Byte 2.
            The following four attributes are tracked:

                0x10 -- Code
                0x20 -- Data
                0x40 -- Double-Byte Data
                0x80 -- ASCII String

            The list of spans is terminated with a record of all-zeros.

            Note that spans marked ASCII string may include some words
            that fall outside the normal printable-ASCII range.
            Emulators interpreting this tag should consider such words as
            "Data" when displaying the span. This allows ASCII strings
            with "special characters" to be marked as a single span, as
            it is likely that's how they'll be specified in the original
            source.

    0x22    Line-number Mapping Table

            This table maps spans of addresses back to lines in an
            assembler listing file. I haven't decided what I want the
            exact format of this section to be, although I think it could
            be a fairly useful tool. I'm still thinking about the format
            of this, but I'm definitely leaving this in.

    0x23 - 0x3F     RESERVED

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 0x40 - 0xDF: Unassigned / RESERVED
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 0xF0 - 0xFF: Extended Tags.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    0xF0 - 0xFF     Extended tags

            The extended tags have an additional level of header to allow
            other programs to at least identify the name of the tag, even
            if they can not use the contents.

            The body of the extended tag starts off with a 4 character
            Creator Code that specifies the "owner" of the tag.
        The Creator Code is intended to be a short fixed-length ASCII
        string that identifies the creator.  All-uppercase Creator Codes
        are to be assigned and controlled by this specification.
        Mixed-case, lowercase and non-ASCII Creator Codes are outside the
        control of this specification.

        It is up to the individual creators to specify the contents and
        structure of their extended tags.  A total of 16 extended tags are
        provided, and the individual creators are allowed to assign
        whatever meaning they like to their own extended tags.

        Currently assigned creators:

            BSKY -- Intellivision Productions, and their various emus.
            INPC -- Carl Mueller Jr.   -- IntvPC
            INWN -- John Dullea        -- IntvWin
            INDS -- John Dullea        -- IntvDOS
            BLIS -- Kyle Davis         -- Bliss
            JZIN -- Joseph Zbiciak     -- jzIntv
            MESS -- Frank Palazzolo    -- M.E.S.S. Intellivision driver
            CART -- Chad Schell        -- Intellicart

==============================================================================
  Mapping between ID Tags and Config Vars
------------------------------------------------------------------------------

Many of the ID tags correspond to configuration variables in the BIN+CFG
format.  These configuration variables appear in the [vars] section of the
CFG file.  Tools that support ID tags and convert between the two formats
should try to preserve as much information as possible when translating
between BIN+CFG and .ROM.

    ID Tag                      Config Variable
    --------------------------  ---------------------------------------------
    0x01 Title                  name = "string"

    0x02 Publisher              publisher = value        (for 0 - 7)
                                publisher = "string"     (otherwise)

    0x03 Credits                author = "string"        (bit 0)
                                game_art_by = "string"   (bit 1)
                                music_by = "string"      (bit 2)
                                sfx_by = "string"        (bit 3)
                                voices_by = "string"     (bit 4)
                                docs_by = "string"       (bit 5)
                                concept_by = "string"    (bit 6)
                                box_art_by = "string"    (bit 7)

    0x04 Related Info           more_info_at = "string"

    0x05 Release Date           year = value
                                release_date = value
                                release_date = "YYYY"
                                release_date = "YYYY-MM"
                                release_date = "YYYY-MM-DD"
                                release_date = "YYYY-MM-DD HH"
                                release_date = "YYYY-MM-DD HH:MM"
                                release_date = "YYYY-MM-DD HH:MM:SS"
                                release_date = "YYYY-MM-DD HH:MM:SS +hh"
                                release_date = "YYYY-MM-DD HH:MM:SS +hhmm"

    0x06 Game Attrib/Compat     ecs_compat = value       See [1] below
                                ecs = value              See [2] below
                                voice_compat = value     See [1] below
                                voice = value            See [3] below
                                intv2 = value            See [4] below
                                intv2_compat = value     See [1] below
                                kc_compat = value        See [1] below
                                tv_compat = value        See [1] below
                                lto_mapper = 0 or 1
                                jlp = value              for JLP Accel
                                jlp_flash = value        for JLP Flash

    0x08 Abbrev. Title          short_name = "string"

    0x09 License                license = "string"

    0x0A Description            description = "string"
                                desc = "string"

    0x0B Build Date             build_date = date
                                (Accepts all the same formats as
                                release_date)

    0x0C Version                version = "string"

[1] The encoding for these variables in BIN+CFG is slightly different from
    the encoding in the ID tag.  The mapping is as follows:

        BIN + CFG   ID Tag      Meaning
            0         11        Incompatible with
            1         00        Don't Care / Tolerates
            2         01        Supports / Is Enhanced By
            3         10        Requires

[2] The 'ecs' variable takes a 0/1 value.  That maps to ID Tag as follows:

        BIN + CFG   ID Tag      Meaning
            0         00        Don't Care / Unspecified
            1         10        Requires

[3] The 'voice' variable takes a 0/1 value.  That maps to ID Tag as
    follows:

        BIN + CFG   ID Tag      Meaning
            0         00        Don't Care / Unspecified
            1         01        Supports / Is Enhanced By

[4] The 'intv2' variable takes a 0/1 value.
    That maps to ID Tag as follows:

        BIN + CFG   ID Tag      Meaning
            0         11        Incompatible
            1         00        Don't Care / Unspecified

==============================================================================
  Closed (?) Issues
------------------------------------------------------------------------------

 -- Is a "list of tags" format the right way, or should I go for more of
    a "Table of Contents"/"Central Directory" approach a la Zip?

    DECISION:  Keep the list-of-tags approach for now.

 -- Should I assign ranges of tags to various emulators, or leave that to
    the Extended Tags?  These tags could be useful for emulator-specific
    configuration, or for storing users' preferences (e.g. keybindings,
    etc.)

    DECISION:  Leave this in the Extended Tag 0xF0, and add an additional
    4-byte "Creator Code".  All-upper creator-codes are controlled by the
    "Central Metadata Tag Authority", which is presently me.  :-)  Any
    mixed-case or lowercase creator codes are not controlled by this
    specification.

 -- And should I drop the "Related Info" tags?

    DECISION:  Kyle doesn't like it, but enough other people do.  It stays.

 -- UTF-8 is supported for many tags that previously held ASCII.

==============================================================================
  Open Issues
------------------------------------------------------------------------------

 -- What about tags for documentation, etc.?  What format?  Text, RTF,
    HTML, MS-Word DOC?

 -- Zip-style compression for tags?

 -- Tags for overlays, box scans, etc.?

 -- Need to audit debugging tags against AS1600's current SMAP ASCII
    format.  It would be nice to pull .SYM and .SMAP files into .ROM while
    we're here.
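
As a reference point for the debugging-tag audit, here is a minimal Python
sketch of how a tool might decode the span records of tag 0x21 (Fine-Grain
Memory Attribute Table) as described earlier in this document.  It is an
illustration only, not code from any existing emulator.  The names
decode_spans and FLAG_NAMES are hypothetical.  It assumes the caller has
already stripped the tag header and passes in just the tag body, and it
assumes the 12-bit length counts words, which the text implies but does not
state outright.

```python
# Attribute flag bits, as listed for tag 0x21 (upper four bits of Byte 2).
FLAG_NAMES = {
    0x10: "Code",
    0x20: "Data",
    0x40: "Double-Byte Data",
    0x80: "ASCII String",
}


def decode_spans(payload: bytes):
    """Decode 4-byte span records until an all-zero record or end of data.

    Returns a list of (address, length, [flag names]) tuples.
    Assumption: `payload` is the tag body only, with no tag header.
    """
    spans = []
    # Step through whole 4-byte records; a trailing partial record is ignored.
    for off in range(0, len(payload) - 3, 4):
        b0, b1, b2, b3 = payload[off:off + 4]
        if (b0, b1, b2, b3) == (0, 0, 0, 0):
            break  # the list is terminated by a record of all zeros
        addr = (b0 << 8) | b1              # Byte 0 = high half, Byte 1 = low half
        flags = b2 & 0xF0                  # upper four bits of Byte 2
        length = ((b2 & 0x0F) << 8) | b3   # 12-bit length (assumed to be in words)
        names = [name for bit, name in FLAG_NAMES.items() if flags & bit]
        spans.append((addr, length, names))
    return spans


if __name__ == "__main__":
    # Two made-up spans followed by the all-zero terminator.
    example = bytes([0x50, 0x00, 0x11, 0x00,
                     0x51, 0x00, 0x80, 0x20,
                     0x00, 0x00, 0x00, 0x00])
    for addr, length, names in decode_spans(example):
        print(f"${addr:04X} len={length} {', '.join(names)}")
```

A disassembler could walk the returned list and, per the note on tag 0x21,
fall back to "Data" display for any word in an ASCII String span that lies
outside the printable range.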