Kodewerx - User contributions [en]

RLM

2020-12-28T06:28:11Z

Parasyte:

RLM is a program designed to "unprotect" Super Mario World ROM hacks that were "protected" with the Lunar Magic level editor. RLM stands for "Recover Lunar Magic". Also seen in the source code, "Rape Lunar Magic" was once considered.

==Version History==

v1.3 (12-26-09):
* Fixed yet another last-known-problem with (what I believe is) "Super ExGFX".
* Support for drag-n-drop (see "Using RLM").
* Readme cleanup.

v1.2 (04-18-05):
* Fixed the last known problem with OverWorld decryption.

v1.1 (01-16-05):
* Fixed a problem with OverWorld decryption.
* Changed copyright years from 2003-2004 to 2003-2005.
* Added a version number define to make version changes easier.

v1.0 (01-13-05):
* Initial release!

==Lunar Magic Protection==

The protection scheme used by Lunar Magic is simple albeit effective against inexperienced hackers. When protecting a hack, the editor will write a few bytes into the SMC header as well as the SNES ROM header which it uses to signal to itself that the ROM is protected. Reversing these changes is enough to bypass the "Lunar Magic cannot open this ROM" message, but you will run into trouble when you open the edited stages.

When a hack is protected, almost everything that Lunar Magic edits is encrypted with a simple XOR operation; unedited stages and objects remain in their original, unencrypted state.

The first thing encrypted is the stage pointers. To anyone looking in a hex editor, they appear to point to invalid stage data (because they do). Decrypting these pointers shows you where to find the valid "RATS" headers, as well as the encrypted stage data. At this point, decrypting the stage data is enough to open the hack in Lunar Magic. However, the ROM will not work on an SNES or an emulator, due to an assembly hack in place which will try to decrypt the already-decrypted stage data. Reverting the assembly hack will make the ROM playable.

Other edited objects in the ROM are also encrypted in a similar manner. The overworld is a particularly big example. I believe the "ExGFX" are also encrypted.

==RLM Specifics==

RLM contains code which allows it to handle LoROM and ExHiROM SNES ROMs in the SMC format. The main decryption loop, decryptLunar(), is a very simplified Super Mario World stage data parser. Most data blocks in this stage format are 3 bytes, but some are 4 or 5 bytes. RLM has to understand this concept (at the very least) in order to decrypt the stage data properly, because only the first two bytes of each block are encrypted.
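
As a rough illustration of that loop (this is not the actual decryptLunar() code), the Python sketch below walks variable-length blocks and XORs only the first two bytes of each. The single-byte key and the toy_block_len() rule are made up for the example; the real key and the real stage-format rules are not described here. It also assumes that the block length has to be read from the decrypted bytes.

```python
from typing import Callable


def xor_decrypt_blocks(data: bytes, key: int,
                       block_len: Callable[[bytes, int], int]) -> bytes:
    """XOR the first two bytes of every block; leave the rest untouched."""
    out = bytearray(data)
    pos = 0
    while pos < len(out):
        # Decrypt the two leading bytes first (assumption: the block
        # length can only be determined from the plaintext).
        for i in range(pos, min(pos + 2, len(out))):
            out[i] ^= key
        pos += block_len(out, pos)
    return bytes(out)


def toy_block_len(buf: bytes, pos: int) -> int:
    # Hypothetical rule, for illustration only: the top two bits of the
    # first byte select a 3-, 4- or 5-byte block.
    return {0: 3, 1: 3, 2: 4, 3: 5}[buf[pos] >> 6]
```

The point of the sketch is the ordering: the parser cannot skip to the next block until it knows how long the current one is, which is why a plain "XOR everything" pass would not work.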

==Download==

RLM is available under the terms of the [https://www.gnu.org/licenses/old-licenses/gpl-2.0.html GNU General Public License]. You can download the zip/gz/bz2 compressed source code from [https://github.com/parasyte/rlm GitHub] or, if you have git, clone the source code repository with the following command:

git clone https://github.com/parasyte/rlm.git

==Usage==

$ ./rlm
RLM (Recover Lunar Magic) - v1.3
Copyright 2003-2009 Parasyte (parasyte@kodewerx.org)
http://www.kodewerx.org/

Usage: ./rlm <in-file.smc> [out-file.smc]

Where <in-file.smc> is the name of the protected Super Mario World ROM, and [out-file.smc] is the file name that you want RLM to create with the unprotected data. Note that [out-file.smc] is optional; if not specified, it will create a new file with the extension of the input file name changed to ".unlocked.smc". For example, "in-file.smc" becomes "in-file.unlocked.smc".
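
The default output name described above can be sketched in a few lines of Python. This is only an illustration of the naming rule; default_output_name() is a made-up helper, and what RLM does with an input name that has no extension is an assumption here.

```python
import os.path


def default_output_name(in_file: str) -> str:
    """Replace the input extension with ".unlocked.smc"."""
    stem, _ext = os.path.splitext(in_file)
    # Assumption: a name without an extension simply gets the new one appended.
    return stem + ".unlocked.smc"
```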

[[Category:Developer Documentation]]

Snesrc

2020-12-28T06:22:10Z

Parasyte:

"SNESRC" is an acronym meaning Super Nintendo Entertainment System Re-Compiler. When referring to the program, it is spelled with lowercase letters: snesrc.

Snesrc was an experiment intended to reproduce the complete assembler source code for SNES games. Its main algorithm is a recursive half-emulation of the 65c816 CPU. The theory is that by doing this, it can trace all possible program paths to create a "map" of the ROM file, which can then be used to disassemble the file with proper code/data separation.

==Downloads==

Snesrc is released under the terms of the [https://www.gnu.org/licenses/old-licenses/gpl-2.0.html GNU General Public License 2.0]. The source code can be downloaded from [https://github.com/parasyte/snesrc github]. You can download the source with git using the following clone command:

git clone https://github.com/parasyte/snesrc.git

==The Map==

The map contains information about whether a byte is part of an instruction; is read as data by another instruction; is the destination of a branch, jump or pointer; or was fixed or not marked 'clean'. The map also contains some information about the state of the CPU when that instruction was encountered.
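
One simple way to picture such a map is one flag byte per ROM byte. The Python sketch below is an assumption about the shape of the data, not snesrc's actual layout; the flag names and bit values are invented for the example.

```python
from enum import IntFlag


class MapFlag(IntFlag):
    CODE = 0x01    # byte is part of an instruction
    DATA = 0x02    # byte is read as data by another instruction
    TARGET = 0x04  # byte is the destination of a branch, jump or pointer
    FIXED = 0x08   # byte was 'fixed'
    DIRTY = 0x10   # byte has not been marked 'clean' yet


def mark(rom_map: bytearray, start: int, length: int, flags: MapFlag) -> None:
    """Set flags on a run of map bytes."""
    for addr in range(start, start + length):
        rom_map[addr] |= int(flags)


def clean(rom_map: bytearray) -> None:
    """Remove the dirty bit from every map byte."""
    for addr in range(len(rom_map)):
        rom_map[addr] &= ~int(MapFlag.DIRTY) & 0xFF
```

The CPU state recorded alongside each instruction would need its own storage (for example a second array); it is left out here to keep the sketch short.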

==The Algorithm==

Because of the recursive nature of the algorithm, it will never trace over a byte twice. This aspect also gives it some severe design flaws, the biggest being that the CPU status attempts to evolve throughout the recursive tree. A branch or jump instruction creates a new branch (fork) within the recursive tree; when that branch ends, the algorithm backs out to the last fork and continues running.

The three conditions which can end a branch are: interrupt returns, hitting an instruction which was already traced, or hitting an instruction which appears to be invalid. In the last case, the algorithm attempts to make an adjustment to the CPU state and re-run over the last branch of instructions (which are still marked 'dirty'). If the branch then ends on one of the other two conditions, the map is marked 'clean' (all dirty bits are removed) and the algorithm continues from the last fork.

==Problem Areas==

The biggest concern with the 65c816 is that its instruction sizes can change dynamically as the program runs (similar to the ARM line of CPUs). A string of instructions can therefore be interpreted differently depending on the CPU state when it is executed, which makes it difficult to disassemble the program, let alone to separate code from data. The CPU can have four possible states which can change instruction sizes. These states depend upon the three CPU status bits which change the size of the accumulator and index registers: ''X'', ''M'', and ''E'':

* When ''X'' is set, the index registers X and Y are 8-bit. When clear, X and Y are 16-bit.
* When ''M'' is set, the accumulator register is 8-bit. When clear, the accumulator is 16-bit.
* When ''E'' is set, the index registers X and Y and the accumulator register are all 8-bit. When clear, the X, Y and accumulator sizes depend on the states of the ''X'' and ''M'' flags, respectively.
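
To make the size problem concrete, here is a small Python sketch of how a disassembler might pick the length of an immediate-mode load from the CPU state. It covers only three opcodes (LDA, LDX and LDY immediate) and the function and parameter names are invented for the example; it is not taken from snesrc.

```python
# Immediate-mode loads whose operand width follows a register size.
ACC_IMMEDIATE = {0xA9}          # LDA #imm: follows the accumulator size
IDX_IMMEDIATE = {0xA2, 0xA0}    # LDX #imm, LDY #imm: follow the index size


def instruction_length(opcode: int, acc_8bit: bool, index_8bit: bool,
                       emulation: bool) -> int:
    """Return opcode byte + operand bytes for the supported opcodes."""
    if emulation:
        # Emulation mode forces every register to 8 bits.
        acc_8bit = index_8bit = True
    if opcode in ACC_IMMEDIATE:
        return 2 if acc_8bit else 3
    if opcode in IDX_IMMEDIATE:
        return 2 if index_8bit else 3
    raise ValueError("opcode not covered by this sketch: 0x%02X" % opcode)
```

The same bytes A9 00 00 are a single 3-byte LDA with a 16-bit accumulator, but a 2-byte LDA followed by a stray 00 byte (which would decode as BRK) with an 8-bit accumulator. A wrong guess about the state therefore derails everything that follows.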

Another common problem area encountered is with jump tables: seen with JMP ($xxxx,X) and JSR ($xxxx,X) instructions, where $xxxx is often a ROM address listing several pointers. Some of these tables can be found and recursed into by snesrc, but not in all cases.

When $xxxx is a RAM address, we have no idea what it's trying to jump to; this is also a problem for JMP (indirect) and JMP [indirect long] instructions. (See [[#Known Bugs|Known Bugs]].) This is common when a function pointer is sent to a subroutine as an argument, or when a pointer needs to be modified in some way. These corner cases are difficult to anticipate and handle correctly, especially with this particular algorithm.

The recursion itself is problematic; RTS/RTL instructions do not back out to the last fork, but instead try to pull the return address off the stack. This leaves the CPU state as it was when it left the subroutine (technically correct). But we still run into strange cases where the emulator will throw a warning or an error (usually an invalid instruction). This seems to be caused by incorrectly following the program path while still expecting the CPU status to be correct in all cases.

===Known Bugs===

* JMP (indirect) and JMP [indirect long] instructions are not supported.
* The ''-fcop'', ''-fstp'', and ''-fwdm'' command line arguments are ignored.
* Needs a command line argument to force the header size (to 0 or 512).

==Usage==

$ ./snesrc
snesrc - The SNES Recompiler v0.01
Copyright 2005 Parasyte

Usage:
snesrc [options] <input file> <output dir>

Options:
-l: Force LoROM
-h: Force HiROM
-r<n>: Set pointer table range check size
-fbrk: Attempt to fix code which reaches a BRK instruction
-fcop: Attempt to fix code which reaches a COP instruction
-fstp: Attempt to fix code which reaches a STP instruction
-fwdm: Attempt to fix code which reaches a WDM instruction

I used the [https://www.pouet.net/prod.php?which=15713 2.68 MHz Demo by Abandon] for most of the testing. It's not possible to force the header size, so you will want to chop off the first 512 bytes of the .smc file using a hex editor. (These 512 bytes are the SMC header, which is mostly null bytes.) This ROM file is also not padded correctly, so you must force snesrc to read the ROM with the LoROM algorithm, using the -l command line argument.

$ ./snesrc -l -fbrk 2mhz.smc 2mhz

This will create a new subdirectory called 2mhz/ where you will find the disassembled bank files (bankXX.asm) and the disassembler pass logs: pass1.log, pass3.log. Since Pass 2 is specific to flushing map bytes, it does not currently log anything.
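
If you would rather not use a hex editor to remove the SMC header mentioned above, a few lines of Python do the same job. This is just a convenience sketch; strip_smc_header() and the file names in the comment are invented, and the function removes the first 512 bytes unconditionally without checking that a header is really there.

```python
def strip_smc_header(rom: bytes, header_size: int = 512) -> bytes:
    """Return the ROM image without its leading SMC header."""
    if len(rom) < header_size:
        raise ValueError("file is smaller than the header")
    return rom[header_size:]


# Example use (hypothetical file names):
#   with open("2mhz-original.smc", "rb") as src:
#       data = strip_smc_header(src.read())
#   with open("2mhz.smc", "wb") as dst:
#       dst.write(data)
```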

==Case Study==

This experiment shows some promise that it is possible to reproduce valid (although not always complete) assembler source code out of simple SNES ROMs. In my tests, I've seen the most success with small "public domain" SNES ROMs. (Technically, these are homebrew ROMs, and are subject to the author's copyright as applicable.)

It would be better to replace the recursive tree with a [https://en.wikipedia.org/wiki/FIFO FIFO] system: any time a conditional branch or subroutine call is found, the destination address and CPU status are appended to the FIFO. Each run then pulls the next entry out of the FIFO, until the FIFO is empty. This would make tracing much easier to debug than a maze-like recursion tree. It should also be much closer to the path taken by an assembler as it was creating the ROM.
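
A minimal Python sketch of the FIFO idea follows. It runs over a toy program table instead of real 65c816 code; the tuple layout, the instruction kinds and the TOY_PROGRAM contents are all invented for the example, and the CPU state is carried along unchanged where a real tracer would have to update it as it goes.

```python
from collections import deque

# address -> (kind, size in bytes, destination or None); hypothetical data
TOY_PROGRAM = {
    0: ("op", 2, None),
    2: ("branch", 2, 8),
    4: ("call", 3, 10),
    7: ("return", 1, None),
    8: ("return", 1, None),
    10: ("op", 1, None),
    11: ("return", 1, None),
}


def trace_fifo(program, entry, entry_state):
    """Trace every reachable instruction once; return {address: state}."""
    traced = {}
    pending = deque([(entry, entry_state)])
    while pending:
        addr, state = pending.popleft()
        # A run ends on a return, an already-traced byte, or unknown bytes.
        while addr in program and addr not in traced:
            kind, size, target = program[addr]
            traced[addr] = state
            if kind in ("branch", "call"):
                # Remember the destination and CPU state for a later run.
                pending.append((target, state))
            if kind == "return":
                break
            addr += size
    return traced
```

Calling trace_fifo(TOY_PROGRAM, 0, "8-bit") visits the straight-line path first and only then the branch and call destinations, so the order of runs is flat and easy to log.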

[[Category:Developer_Documentation]]

Syndrome

2020-12-28T06:18:51Z

Parasyte:

Syndrome is a level editor for the Mega Man 7 SNES game. Syndrome has undergone 2 complete rewrites since the project began (three iterations in all). The most recent rewrite makes it platform independent, with main development taking place mostly on Mac OS X.

==History==

Work on Syndrome began on September 25, 2002 as a Visual Studio 6 MFC project. The screenshots below depict two different builds of the interface. On the left, an older version showing the first time a complete screen was rendered. On the right, a newer version with some tile issues fixed (also with transparency rendered in pink) and an empty sidebar added (once planned to become a toolbox):

[[Image:Syndrome.png]] [[Image:Syndrome2.png]]

On May 23, 2004, I decided to drop MFC (and C++, for that matter) and entirely rewrote the program in C for MinGW. This iteration saw a completely new interface, using an [https://en.wikipedia.org/wiki/Multiple_document_interface MDI] layout and resizable windows. A console window was also included which would output debug information:

[[Image:Syn1.png|800px]] [[Image:Syn2.png|800px]]

[[Image:Syn3.png]] [[Image:Syn4.png]]

On February 8, 2009, over 6 years after the project started, I decided to drop the Win32 dependencies completely. This second rewrite (or third iteration) is currently in progress, written in JavaScript for XULRunner. The plan is to eventually include C++ modules for efficiency reasons.

On December 21, 2020, after an extra decade of dormancy, the project was barely resurrected from extinction since Bitbucket decided to hard-deprecate and delete all Mercurial repos. I was able to find a copy on the [https://web.archive.org/web/20141120183658/https://bitbucket.org/parasyte/syndrome Internet Wayback Machine] and created a new repo on github.

==Download==

Syndrome source code is available on github: [https://github.com/parasyte/syndrome]. You can also clone the git repository with the following command:

$ git clone https://github.com/parasyte/syndrome.git

==Building==

The current source tree is a minimal implementation of a XULRunner application; the XUL, CSS, and JS sources are available in the /src subdirectory, and two shell scripts are included for launching the application. Note that XULRunner has been retired by Mozilla. Some modern replacement technologies are [https://nwjs.io/ NW.js] and [https://github.com/electron/electron Electron]. A lightweight alternative is [https://github.com/webview/webview webview].

The Linux shell script requires XULRunner to be installed and available in your PATH. If you are on a recent version of Ubuntu, you already have everything needed to try it out:

$ ./syndrome-linux

The Mac OS X shell script will build an application bundle and place it into a new /Applications/Kodewerx directory. Then it will run the application. To use this, you must download a recent copy of the [https://releases.mozilla.org/pub/mozilla.org/xulrunner/releases/ XULRunner runtime]. At this time, the most current release is version 1.9.2.10:

$ ./syndrome-osx

For Windows, just double click the "syndrome-win" shortcut. You must have Firefox 3.0 (or newer) installed in the default location (C:\Program Files\Mozilla Firefox) for the shortcut to work. If you have Firefox installed to a different location, you can change the shortcut to point to the firefox.exe file by right clicking the shortcut and choosing "Properties".

XULRunner 1.9.1 adds some much needed functionality such as better support for the HTML Canvas element and much faster JavaScript execution time. Version 1.9.2 is also in the works which adds even more interesting features, such as scaling canvases (and images) with the nearest neighbor algorithm to prevent ugly blurring artifacts; perfect for low resolution pixel art. This can be done with the CSS [https://developer.mozilla.org/En/CSS/Image-rendering image-rendering: -moz-crisp-edges;], or the [https://developer.mozilla.org/en/Canvas_tutorial/Using_images#Controlling_image_scaling_behavior mozImageSmoothingEnabled] property on canvases.

We need a proper build environment using configure/make on Linux and Mac OS X, and a Visual Studio project for Windows. [https://www.cmake.org/ CMake] might be a good choice. Though recently, I am really liking [https://www.scons.org/ SCons].

==Syndrome==

===The Name===

I am often asked why the editor is called Syndrome; the name isn't typical of ROM hacking tools. The word 'syndrome' itself does not directly relate to Mega Man 7 or the Mega Man series in any way. The main reason for this is just to be different! But that doesn't explain where the name came from.

I'm not a big fan of the Mega Man series, to be entirely honest. (Writing an editor for this game was a suggestion that I took one day when bored and feeling an intense urge to build an editor.) I'm not familiar with many of the characters in the series, even; however, I knew that there was a character ... somewhere ... named Sigma. Sigma is a nice name. Almost as fancy as Epsilon. But that's another story.

The words 'sigma' and 'syndrome' (in my head, at least) have a very synesthetic feeling about them, such that when I hear 'sigma' in my head, I also hear 'syndrome'. And that's the ultimately boring story about how the project came to be known as Syndrome!

===The Reasoning Behind XULRunner===

XULRunner is a nice platform; especially for people who are already familiar with doing web application programming. It's easy for web developers to pick up a XULRunner app and modify it to fit their needs. However, it's not very easy for a web developer to start a XULRunner application from scratch -- that's what XULRunner developers are for.

Briefly, I chose XULRunner as my cross platform framework because I wanted to avoid A) non-native looking widgets (rules out GTK+) and B) C++ (rules out wxWidgets and a whole lot of other similar toolkits). On top of that, I realized the potential of the XULRunner platform in Firefox (extensions, themes, internationalization, accessibility, ...) and I feel that Syndrome can benefit from these technologies.

Back to web developers: most people who can write an application can also write a web page, but not vice versa. One of the things missing from the ROM hacking communities is a firm grasp on open source software. Granted, it has been getting better in recent years. One thing I would like to accomplish with Syndrome is creating an open ROM hacking editor framework; something which contains general utilities useful to ROM hacking editors. (In other words ... a really simple place to start when someone wants to make their own editor. And if they know HTML and some JavaScript, they can probably get going in the right direction with Syndrome.)

===Mega Man 7 Information===

Syndrome is hard-coded to read data out of the US Mega Man 7 ROM. Other regions probably will not work. The first thing it does after opening the ROM file is detect the ROM mapping mode. (This is part of the 'general utilities' described above!) Mega Man 7 is strictly a HiROM-mapped SNES ROM, but Syndrome does not care. It will detect the mapping mode anyway. This makes the ROM reading class general purpose. (This class also handles SMC format ROMs with or without the SMC header, which is another win for modularity.)
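
For readers who have not seen mapping detection before, here is a Python sketch of one common heuristic: look for the internal ROM header at the LoROM and HiROM locations and check that the checksum and its complement agree. This is not Syndrome's actual ROM reading class; the function names are invented, and real detectors usually score several more header fields.

```python
import struct


def split_smc_header(data: bytes):
    """Split off a 512-byte SMC header if the file size suggests one."""
    # Headerless dumps are normally a multiple of 1 KiB in size.
    header_len = 512 if len(data) % 1024 == 512 else 0
    return data[:header_len], data[header_len:]


def detect_mapping(rom: bytes) -> str:
    """Guess the mapping mode of a headerless ROM image."""
    for name, base in (("LoROM", 0x7FC0), ("HiROM", 0xFFC0)):
        if len(rom) < base + 0x20:
            continue
        # Offsets 0x1C and 0x1E hold the checksum complement and checksum.
        complement, checksum = struct.unpack_from("<HH", rom, base + 0x1C)
        if complement ^ checksum == 0xFFFF:
            return name
    return "unknown"
```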

The next thing Syndrome does is read the ROM title from the header, but in its current state it will not verify this information (currently used for debugging only).

Finally, it begins to "build" the first stage. This is a convoluted process (due to compression) which ends up with a very large HTML canvas (4864 x 1536 px) and a very basic SNES video mode buffer. The video mode buffer is a collection of tiles, palettes, and tile maps. These buffers are in the same format as used by the SNES.

The next logical step is drawing pixels to the canvas using the tile maps, tiles, and palettes. Unfortunately, trying to do that in JavaScript alone was far too slow in my tests. I'm trying to learn the XULRunner XPCOM system well enough to write the rendering code in C++.
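
To give an idea of the work involved per tile, the Python sketch below decodes one tile and one palette color from the SNES formats. It assumes 4 bits per pixel tiles, which is the common background format but is an assumption about this game's data; the function names are invented, and this is not Syndrome's rendering code.

```python
from typing import List, Tuple


def decode_4bpp_tile(tile: bytes) -> List[List[int]]:
    """Turn a 32-byte planar 4bpp tile into 8 rows of 8 palette indices."""
    if len(tile) != 32:
        raise ValueError("a 4bpp tile is exactly 32 bytes")
    rows = []
    for y in range(8):
        # Bitplanes 0/1 are interleaved in the first 16 bytes,
        # bitplanes 2/3 in the last 16 bytes.
        planes = (tile[2 * y], tile[2 * y + 1],
                  tile[16 + 2 * y], tile[16 + 2 * y + 1])
        row = []
        for x in range(8):
            bit = 7 - x  # the leftmost pixel is the most significant bit
            row.append(sum(((p >> bit) & 1) << n
                           for n, p in enumerate(planes)))
        rows.append(row)
    return rows


def bgr555_to_rgb(color: int) -> Tuple[int, int, int]:
    """Expand a 15-bit SNES color word to 8-bit-per-channel RGB."""
    r = color & 0x1F
    g = (color >> 5) & 0x1F
    b = (color >> 10) & 0x1F
    return (r << 3, g << 3, b << 3)
```

Every pixel costs several shifts and masks, and a 4864 x 1536 px canvas has roughly 7.5 million pixels, which helps explain why a naive script implementation gets slow.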

====Compression====

The compression algorithm used on tiles is an [https://en.wikipedia.org/wiki/Lempel-Ziv-Storer-Szymanski LZSS] derivative. It may also be used on other kinds of data.

Another intermediate form of compression used is often referred to as "Tile Squaroid Assembler" ("TSA") in the ROM hacking community. I dislike this name (in part because "Squaroid" is not a word in the English language) but do not have a better one to suggest. The method resembles a "[https://en.wikipedia.org/wiki/Dictionary_coder Dictionary Table Encoded]" ("DTE") array; a very simple form of [https://en.wikipedia.org/wiki/Hash_tables Hash Table].

The idea behind the DTE/hash table is to represent several bytes (a "block") of information as a smaller block (typically only a single byte) of information. As an example, it is easy to represent a 2x2 square of tiles in only one byte; the byte used can therefore reference a total of 256 different 2x2 block combinations. The combinations available are first compiled into a big list called a dictionary. To decode a byte ''n'' into the resulting 2x2 square of tiles, you simply have to look up the ''n''th entry in the dictionary.
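
The lookup can be sketched in Python as follows. The dictionary layout used here (four tile numbers per entry, ordered top-left, top-right, bottom-left, bottom-right) is an assumption made for the example and not the format found in the ROM.

```python
def decode_blocks(codes: bytes, dictionary, blocks_per_row: int):
    """Expand one-byte block codes into a grid of tile numbers."""
    block_rows = len(codes) // blocks_per_row
    grid = [[0] * (blocks_per_row * 2) for _ in range(block_rows * 2)]
    for i, n in enumerate(codes[:block_rows * blocks_per_row]):
        by, bx = divmod(i, blocks_per_row)
        # Look up the n-th dictionary entry and paste its 2x2 tiles.
        tl, tr, bl, br = dictionary[n]
        grid[2 * by][2 * bx], grid[2 * by][2 * bx + 1] = tl, tr
        grid[2 * by + 1][2 * bx], grid[2 * by + 1][2 * bx + 1] = bl, br
    return grid
```

With a two-entry dictionary [(0, 0, 0, 0), (1, 2, 3, 4)], the two code bytes 01 00 expand into a 4 x 2 grid of tiles: two bytes in, eight tile numbers out.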

Given that each tile in the background layer tile map consumes 2 bytes, a stage the size of the MM7 intro level (4864 x 768 px for just the foreground) would require a tile map that's 116,736 bytes long. (Nearly 5% of the full ROM! And this is probably the smallest stage in the game.) That does not even include the tiles or palettes.

One way to "compress" a stage this size into something more reasonable is by breaking everything into smaller 'chunks' and indexing the crap out of it. To illustrate, we can start at the 'stage' level; a stage of 4864 x 768 px can be broken into chunks of 256 x 256 px 'rooms', giving a total map size of 19 x 3 'rooms' (that's 57 bytes!).

Now break each 'room' into 8 x 8 'structures', giving a total of 64 bytes per 'room'.

Break each 'structure' into 2 x 2 'blocks', giving a total of 8 bytes per 'structure'.

Break each 'block' into 2 x 2 tiles, giving a total of 8 bytes per 'block'.

...

You might imagine how this 'chunkifying' saves a whole lot of space, because it reduces redundancy (the main method of data compression, after all). With a stage like the intro (57 total rooms in the map) you might only use a maximum of 15 different kinds of rooms. And the same goes for creating rooms, structures, and blocks.
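
The numbers quoted above are easy to check. The Python sketch below redoes the arithmetic, assuming the usual 8 x 8 px SNES background tiles; the function names are invented for the example.

```python
TILE_PX = 8               # assumption: 8 x 8 px background tiles
BYTES_PER_TILE_ENTRY = 2  # each tile map entry is 2 bytes


def flat_tile_map_bytes(width_px: int, height_px: int) -> int:
    """Size of an uncompressed tile map covering the whole stage."""
    tiles = (width_px // TILE_PX) * (height_px // TILE_PX)
    return tiles * BYTES_PER_TILE_ENTRY


def room_grid(width_px: int, height_px: int, room_px: int = 256):
    """Number of rooms across and down."""
    return (width_px // room_px, height_px // room_px)


flat = flat_tile_map_bytes(4864, 768)   # 608 * 96 tiles * 2 bytes
across, down = room_grid(4864, 768)     # 19 rooms by 3 rooms
room_map = across * down                # one byte per room
# Room map plus 15 distinct rooms at 64 bytes each; the structure and
# block dictionaries are not counted here.
rooms_total = room_map + 15 * 64
```

Even before counting the shared structure and block dictionaries, the room level alone shrinks from 116,736 bytes to about a thousand.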

===The Editor Interface===

This is an area that I've spent a good deal of time exploring. Over 6 years' worth of exploration, in fact. I think the best interface for this kind of game (considering the compression used, as explained above) will be to edit only one piece of the overall 'stage' at a time. For instance, the editor will start up in map edit mode on stage 0 (intro). In the map edit mode, the main canvas will display the complete stage. A toolbox will allow the user to select what kind of tool to use (in this example, a 'stamp' tool, to make changes). And below the toolbox, a palette for selecting one of the available 'rooms': Select an available room from the palette, and with the stamp tool, click on the map to 'stamp' that room into the location where the mouse is hovering.

Now this is fine for moving rooms around the map. But to get finer grained control over your editing, you would want to enter 'room edit mode' by double clicking on one of the rooms displayed in the palette. This will change the main canvas to display the room you double clicked, and the palette will change to display all available structures which can be placed into that room.

Double clicking a structure will then go one level deeper into structure edit mode; structure in the main canvas, and all available blocks in the palette.

And again, enter block edit mode by double clicking, and finally, tile edit mode by double clicking a tile in the block edit mode's palette. (The tile editing mode has a palette containing ... well, a color palette! Would it also be reasonable to enter a 'palette editing mode' by double clicking one of the available colors?)

This represents a total of 5-6 nesting levels (map, room, structure, block, tile, ?palette?). To make this easy to use, 'back and forward' buttons should be made available. Links to all higher levels should also be available, similar to the GTK+ file picker window.

==TODO==

Wiki Article:
* Mockup screenshots to help illustrate conceptual ideas such as the nested-style user interface. Use [https://balsamiq.com/wireframes/ Balsamiq Wireframes] to create the mockups.

Syndrome:
* Write the rendering module in C as a dynamic library; call exported functions with [https://developer.mozilla.org/en/js-ctypes js-ctypes].
* Extend the GetDataX() functions: maybe GetDataPtrX() ? or functions to get arrays...

[[Category:Developer_Documentation]]

Kwiki

2020-12-28T05:53:01Z

Parasyte:

Welcome to the Kodewerx wiki. While browsing its contents, remember that ''you'' can make changes to anything you see here. It might be as simple as bad spelling or grammar, or it could be an entirely new page that you create; it's up to you to help the Kodewerx wiki grow into a valuable resource for hackers and programmers alike.

To get you started, here's what we have going on right now:

==Kodewerx Projects==

The main page for "Kodewerx Projects" can be found at the [[:Category:Developer Documentation|Developer Documentation]] page. Most of these projects are immature, and not yet usable for ordinary users. (We need your help!) Source code is available at https://github.com/parasyte, https://github.com/blipjoy, and https://github.com/rust-console/cargo-n64

-[[User:Parasyte|Parasyte]] 06:57, 21 October 2010 (UTC)

Debugging Modern Computer Architectures

2020-12-28T05:51:37Z

Parasyte: Fix link

How does one solve the problems with current debuggers? First, by identifying those problems. Next, by addressing them. Finally, by implementing the solutions.

==Problems with integrated debuggers==

So what are the problems with debuggers integrated in today's emulators? Well, for one thing, they are integrated. This can cause portability problems, in many cases (I am ashamed to admit my guilt in perpetuating this problem, by writing debuggers that vendor lock users to the Windows operating system). It can also cause undue stress for debugger developers. We are a lazy species, and we do not like rewriting the same debugger multiple times, attempting to port our work to a newer, better emulator, or porting it to a completely new emulated architecture. And then there is the problem of features, or lack thereof. Some hackers and homebrewers need specialized features in their debuggers.

==Solutions==

===Modularity===

Modularity is one possible solution to these problems.

The first thing to do is segregate the low-level debug primitives (functions and whatnot) from the user interface; make the interface modular, interchangeable with any interface. Then you define how the debug primitives interact with the interface via a communications link; make the communications link modular, able to establish communication using any number of interchangeable modules for TCP/IP sockets, operating system pipes, RS232, USB, etc. Next, you define the protocol; make the protocol modular, a 'universal language' that describes generic debug primitives, and allow it to be extensible as necessary. Finally, you define those debug primitives and provide a base implementation that can be expanded if required. However, a well-defined set of primitives is unlikely to need expansion for anything but the most exotic architecture configurations.

===Standardization===

What does all of this mean? Where does it leave us, the debugger developers? And where does it place the users, the hackers, and the homebrew developers?

It means that the debugger developers can implement an accepted standard (accepted being the keyword) for debugger support within not only emulators, but any kind of virtual machine or interpreted byte code in any kind of program. It could be a simple set of debug primitives (in a static or linked library, for example) added by an emulator author (or emulator extender) that connects to a debugger interface of the user's choice. The interface might be highly specialized for a particular architecture, or it might be very complex and advanced with universal support for many architectures.

This would put a large number of options into the hands of users.

===The protocol===

Now let me try to get a more solid description of this idea out there. The number one underlying technology to be assessed to make any of this work is simply the protocol. That means, a formal description of how a target (an emulator, or other program wishing to use debugger functionality) talks to an interface (a separate program designed to give the user direct access to the debug primitives and link them together in ways that provide many very advanced features ... such as stepping backwards in architecture-time).

This would probably be a command reference which supplies things like:

# A description of the architecture (the emulated system, like NES). This description would include the number of CPUs available, the type of the CPUs, endianness, memory maps as accessible by the CPU, memory maps not accessible to the CPU, etc. Basically a complete virtual model of the architecture.
# Debug primitives: breakpoints and stepping functionality; read/write access to the memory maps, cpu registers and statuses, and access to internal hardware registers; interrupt and exception handling; scripted macros with callback functions; essentially all of the basic functions which the interface can use to procedurally create high-level features.
# Extensibility; able to provide expansions to architecture descriptions, debug primitives, and other specialty features.

With such a protocol in place, the interface can do the rest of the high-level work; disassembling, video memory viewing and modification, hex editing, cheat searching and management, etc.
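
To make the idea a little more tangible, here is a Python sketch of what an architecture description and a debug primitive could look like on the wire. Everything in it is hypothetical: the class names, the field names, the JSON-lines encoding and the "describe" and "read_memory" command names are invented for illustration and are not part of any agreed protocol.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Cpu:
    model: str
    endianness: str


@dataclass
class MemoryMap:
    name: str
    start: int
    size: int
    cpu_visible: bool = True  # False for maps the CPU cannot access


@dataclass
class Architecture:
    name: str
    cpus: List[Cpu] = field(default_factory=list)
    memory_maps: List[MemoryMap] = field(default_factory=list)


def encode_command(primitive: str, **args) -> bytes:
    """Serialize one debug primitive as a single JSON line."""
    message = {"cmd": primitive, "args": args}
    return (json.dumps(message, sort_keys=True) + "\n").encode("utf-8")


def decode_command(line: bytes):
    """Parse a JSON line back into (primitive, arguments)."""
    message = json.loads(line.decode("utf-8"))
    return message["cmd"], message["args"]
```

Because the messages are plain bytes, the same two functions work over a TCP socket, a pipe or a serial link. A target could announce itself with encode_command("describe", architecture=asdict(arch)), and an interface could then build a hex editor from nothing more than repeated "read_memory" requests.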

==Implementation==

There is currently an ongoing effort to research the possibilities defined in this article. You can read more about the current progress of this project, along with related details, on the [[Universal Debugger Project]] page.

==Closing statements==

I'm hoping this has been verbose enough that you all understand where I am coming from, but not so verbose that I've created confusion or gone completely in the wrong direction in the discussion.

Bottom line is, I think we only need to agree on one thing: the protocol. If you refuse to believe that, and only want to do your own thing with your own emulator, that's quite alright. But if you want to reap the benefits of interchangeable debugger interfaces [pick your favorite, or just choose the right one for the job at hand] that are platform-independent [can run on any host operating system, even a completely different machine from the target emulator; not at all bound to the target emulator] and potentially architecture-independent [capable of debugging NES, Genesis, PS2, Wii, Java, brainf**k, the custom scripting language in your new game, you name it!] then I say let's work some crazy Voodoo and invent ourselves a standard for modern debugging!

That said, [https://tools.ietf.org/html/rfc909 RFC-909, Loader Debugger Protocol] looks like a good place to start.

[[Category:Developer_Documentation]]

Snesrc

2020-12-28T05:47:54Z

Parasyte: HTTPS

"SNESRC" is an acronym meaning Super Nintendo Entertainment System Re-Compiler. When referring to the program, it is spelled with lowercase letters: snesrc.

Snesrc was an experiment intended to reproduce the complete assembler source code for SNES games. Its main algorithm is a recursive half-emulation of the 65c816 CPU. The theory is that doing this, it can trace all possible program paths to create a "map" of the ROM file which can disassemble the file with proper code/data separation.

==Downloads==

Snesrc is released under the terms of the [https://www.gnu.org/licenses/old-licenses/gpl-2.0.html GNU General Public License 2.0]. The source code can be downloaded from [https://github.com/parasyte/snesrc github]. You can download the source with git using the following clone command:

git clone https://github.com/parasyte/snesrc

==The Map==

The map contains information about whether a byte is part of an instruction; is read as data by another instruction; is the destination of a branch, jump or pointer; or was fixed or not marked 'clean'. The map also contains some information about the state of the CPU when that instruction was encountered.

==The Algorithm==

Because of the recursive nature of the algorithm, it will never trace over a byte twice. This aspect also gives it some severe design flaws. The biggest one being that the CPU status attempts to evolve throughout the recursive tree. A branch- or jump-instruction creates a new branch (fork) within the recursive tree; when that branch ends, the algorithm backs out to the last fork and continues running.

The three conditions which can end a branch are: interrupt returns, hitting an instruction which was already traced, or hitting an instruction which appears to be invalid. In the latter case, the algorithm attempts to make an adjustment to the CPU state and re-run over the last branch of instructions (which are still marked 'dirty'). If the branch then ends on one of the other two conditions, the map is marked 'clean' (all dirty bits are removed) and the algorithm continues from the last fork.

==Problem Areas==

The biggest concern with the 65c816 is that its instruction sizes can change dynamically as the program runs (similar to the ARM line of CPUs). This makes it difficult to disassemble the program, let alone to separate code from data: the same string of bytes can be interpreted differently depending on the CPU state when it is executed. The CPU can have four possible states which change instruction sizes. These states depend upon three CPU status bits which change the size of the accumulator and index registers: ''X'', ''M'', and ''E'':

* When ''X'' is set, the index registers X and Y are 8-bit. When clear, X and Y are 16-bit.
* When ''M'' is set, the accumulator register is 8-bit. When clear, the accumulator is 16-bit.
* When ''E'' is set, both the index registers X and Y, and the accumulator register are 8-bit. When clear, X, Y and accumulator sizes depend on the states of ''X'' and ''M'', respectively.
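To make the size dependence concrete, here is a small Python sketch that computes the length of an immediate-mode instruction from the three flags listed above. The function and parameter names are my own for illustration and do not come from snesrc; the opcode values are the standard 65c816 immediate-mode encodings.

```python
# Immediate-mode opcodes whose operand width follows the accumulator size:
# ORA, AND, EOR, ADC, BIT, LDA, CMP, SBC.
ACCUM_IMMEDIATE = {0x09, 0x29, 0x49, 0x69, 0x89, 0xA9, 0xC9, 0xE9}
# Immediate-mode opcodes whose operand width follows the index size:
# LDY, LDX, CPY, CPX.
INDEX_IMMEDIATE = {0xA0, 0xA2, 0xC0, 0xE0}


def immediate_length(opcode, accum_8bit, index_8bit, emulation):
    """Return the total length in bytes (opcode + operand)."""
    if emulation:
        # Emulation mode forces both register groups to 8-bit.
        accum_8bit = index_8bit = True
    if opcode in ACCUM_IMMEDIATE:
        return 2 if accum_8bit else 3
    if opcode in INDEX_IMMEDIATE:
        return 2 if index_8bit else 3
    raise ValueError("not a size-dependent immediate opcode: %#04x" % opcode)


# The same LDA #imm opcode occupies 2 or 3 bytes depending on the flags.
short_lda = immediate_length(0xA9, accum_8bit=True, index_8bit=True, emulation=False)
long_lda = immediate_length(0xA9, accum_8bit=False, index_8bit=True, emulation=False)
```

A disassembler that guesses the wrong accumulator size at an LDA #imm will start decoding the next instruction one byte too early or too late, which is why the trace has to carry the CPU state along with it.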

Another common problem area encountered is with jump tables: seen with JMP ($xxxx,X) and JSR ($xxxx,X) instructions, where $xxxx is often a ROM address listing several pointers. Some of these tables can be found and recursed into by snesrc, but not in all cases.

When $xxxx is a RAM address, we have no idea what it's trying to jump to; this is also a problem for JMP (indirect) and JMP [indirect long] instructions. (See [[#Known Bugs|Known Bugs]].) This is common when a function pointer is sent to a subroutine as an argument, or when a pointer needs to be modified in some way. These corner cases are difficult to anticipate and handle correctly, especially with this particular algorithm.

The recursion itself is problematic; RTS/RTL instructions do not back out to the last fork, but instead try to pull the return address off the stack. This leaves the CPU state as it was when it left the subroutine (technically correct). But we still run into strange cases where the emulator will throw a warning or an error (usually an invalid instruction). This seems to be caused by incorrectly following the program path, but expecting the CPU status to be correct in all cases.

===Known Bugs===

* JMP (indirect) and JMP [indirect long] instructions are not supported.
* The ''-fcop'', ''-fstp'', and ''-fwdm'' command line arguments are ignored.
* Needs a command line argument to force the header size (to 0 or 512).

==Usage==

$ ./snesrc
snesrc - The SNES Recompiler v0.01
Copyright 2005 Parasyte

Usage:
snesrc [options] <input file> <output dir>

Options:
-l: Force LoROM
-h: Force HiROM
-r<n>: Set pointer table range check size
-fbrk: Attempt to fix code which reaches a BRK instruction
-fcop: Attempt to fix code which reaches a COP instruction
-fstp: Attempt to fix code which reaches a STP instruction
-fwdm: Attempt to fix code which reaches a WDM instruction

I used the [https://www.pouet.net/prod.php?which=15713 2.68 MHz Demo by Abandon] for most of the testing. It's not possible to force the header size, so you will want to chop off the first 512 bytes of the .smc file using a hex editor. (These 512 bytes are the SMC header; it's mostly null bytes.) This ROM file is also not padded correctly, so you must force snesrc to read the ROM with the LoROM algorithm, using the -l command line argument.

$ ./snesrc -l -fbrk 2mhz.smc 2mhz

This will create a new subdirectory called 2mhz/ where you will find the disassembled bank files (bankXX.asm) and the disassembler pass logs: pass1.log, pass3.log. Since Pass 2 is specific to flushing map bytes, it does not currently log anything.
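If a hex editor is not at hand, the header removal mentioned above can also be done with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it removes the first 512 bytes unconditionally, so it should only be run on a file that is known to carry the SMC header.

```python
SMC_HEADER_SIZE = 512


def strip_smc_header(data: bytes, header_size: int = SMC_HEADER_SIZE) -> bytes:
    """Return the ROM image without its leading copier header."""
    if len(data) < header_size:
        raise ValueError("file is smaller than the header")
    return data[header_size:]


def strip_file(src_path: str, dst_path: str) -> None:
    """Read src_path, drop the header, and write the rest to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_smc_header(data))


# In-memory demonstration: a 512-byte null header followed by two ROM bytes.
stripped = strip_smc_header(bytes(SMC_HEADER_SIZE) + b"\x78\x18")
```

For a real file, something like strip_file("2mhz.smc", "2mhz_noheader.smc") would do the job; the output file name here is just a placeholder.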

==Case Study==

This experiment shows some promise that it is possible to reproduce valid (although not always complete) assembler source code out of simple SNES ROMs. In my tests, I've seen the most success with small "public domain" SNES ROMs. (Technically, these are homebrew ROMs, and are subject to the author's copyright as applicable.)

It would be better to replace the recursive tree with a [https://en.wikipedia.org/wiki/FIFO FIFO] system. Any time a conditional branch or subroutine call is found, the destination address and CPU status are appended to the FIFO. That information is then pulled out of the FIFO for each run until the FIFO is empty. This would make tracing much easier to debug than a maze-like recursion tree. It should also be much closer to the path taken by an assembler as it was creating the ROM.
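The FIFO idea could look roughly like the following Python sketch. It is not how snesrc works today; the step callback is a hypothetical stand-in for a real 65c816 decoder, and the toy "program" below exists only to exercise the loop.

```python
from collections import deque


def trace(entry, state, step):
    """Trace all reachable instruction addresses using a FIFO work list.

    step(addr, state) is a hypothetical decoder. It returns a tuple
    (next_addr, next_state, targets), where next_addr is None when the
    run ends and targets is a list of (address, state) pairs found at
    branches or subroutine calls.
    """
    visited = set()
    fifo = deque([(entry, state)])
    while fifo:
        addr, st = fifo.popleft()
        # Follow one straight-line run until it ends or rejoins traced code.
        while addr is not None and addr not in visited:
            visited.add(addr)
            addr, st, targets = step(addr, st)
            fifo.extend(targets)
    return visited


# Toy program: address -> (fall-through address, branch targets).
TOY = {
    0: (1, []),
    1: (2, [(10, "state")]),
    2: (None, []),
    10: (11, []),
    11: (None, [(0, "state")]),
}


def toy_step(addr, st):
    next_addr, targets = TOY[addr]
    return next_addr, st, targets


reached = trace(0, "state", toy_step)
```

Because every pending destination sits in one flat queue together with its CPU state, the work list can be printed at any point, which is the debugging advantage argued for above.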

[[Category:Developer_Documentation]]

Snesrc

2020-12-28T05:46:48Z

Parasyte: Fix links

"SNESRC" is an acronym meaning Super Nintendo Entertainment System Re-Compiler. When referring to the program, it is spelled with lowercase letters: snesrc.

Snesrc was an experiment intended to reproduce the complete assembler source code for SNES games. Its main algorithm is a recursive half-emulation of the 65c816 CPU. The theory is that doing this, it can trace all possible program paths to create a "map" of the ROM file which can disassemble the file with proper code/data separation.


Scalable Remote Debugger Protocol

2020-12-28T05:23:24Z

Parasyte: Update protocol to HTTPS

This page is currently serving as a reference to kick-start development of the universal debugger protocol which will be used by the [[Universal Debugger Project]] and hopefully many, many other debuggers and debugger interfaces in the years to come.

==References==

These references are listed in order of relevance; most relevant first.

# [https://tools.ietf.org/html/rfc909 RFC-909: Loader Debugger Protocol]
# [https://sourceware.org/gdb/onlinedocs/gdb/Remote-Protocol.html GDB Remote Serial Protocol]
# [https://tools.ietf.org/html/rfc643 RFC-643: Network Debugging Protocol]
# [http://networksorcery.com/enp/ien/ien158.txt IEN-158: XNET Debugging Protocol]
# [https://xdebug.org/docs/dbgp DBGp: A common debugger protocol for languages and debugger UI communication]

I ranked this list by my interest in each spec, as well as by its potential for generic use and protocol extension.

RFC-909 is so far the closest thing I have found which resembles the general idea I have for a "Universal Debugger Protocol". It's composed as a simple binary packet, it's extensible, and it's designed to be stacked on top of existing transport protocols such as TCP/IP. I doubt this exact spec will fit all of our needs, but it is certainly a good start.

GDB provides a fairly popular protocol. This one is designed for serial communications, so it will work well with small embedded devices. But it could be complicated to extend while retaining its GDB friendliness.

RFC-643 and IEN-158 are interesting only because they show that some experimentation with the idea of remote debugging has been done in the past. Unfortunately, these specs were designed for a specific architecture, and are of little practical use for our purposes.

DBGp shows what a modern remote debugging protocol can look like; including modern XML syntax. The downside to this is that low-level debuggers in small embedded devices are unlikely to parse XML at all.

==Ideas==

This section represents my ([[User:Parasyte|Parasyte]]) own personal opinions and ideas, and should not be taken as advocacy for standardization.

One of the main goals of developing a "universal" protocol for debugging is that it must be usable everywhere; in small embedded devices, and some of the most powerful machines in the world. This kind of flexibility must be designed around multiple layers of abstraction. See [https://en.wikipedia.org/wiki/OSI_Model OSI Model] and [https://en.wikipedia.org/wiki/Internet_Protocol_Suite Internet Protocol Suite] for examples of abstraction layers used in communications technologies.

At the lowest layer, you find the ''wire''; the physical means of transmitting information over distance. For our purposes, we should not limit ourselves to a single wire. Instead, we should allow the use of multiple wires, user-selectable, but never more than one at a time.

The following layers get more and more generic and abstract, until you reach the highest layer which represents what the application sees and interacts with. This would be the "protocol" itself.

So let's break these components down, hypothetically, and get into some details, ordered lowest layer first:

# '''Physical layer''': Some examples of wires to support include LAN (Ethernet/WiFi), Wireless (Bluetooth), RS-232 (serial port, USB serial port), Inter-Process Communication (Domain Sockets? DBUS?)
# '''Transport layer''': Some examples of transport protocols include TCP/IP, UDP/IP (LAN, Domain Sockets), UART (RS-232), IPC-specific (DBUS)
# '''Application layer''': A library (or similar service, e.g. a daemon) to tie all transport layers into a single API that, to the application, looks like one simple interface to connect and send/receive data. The library/daemon will have to handle the transport-specific details behind the scenes.

Thinking about this led to a conundrum: if we support multiple wires, we have to support multiple transport protocols which are compatible with those wires. And if we support multiple transport protocols, we have to know which one our target implements. To make the API as simple as possible, we must not, for example, force clients to choose from configurable options that require a large number of changes for each different type of connection made. How do we simplify the API so that a user can just plain connect without doing any pre-setup work?

Answer: The [https://en.wikipedia.org/wiki/URI_scheme URI scheme]. The unfortunate downside to this solution is that using URI schemes without registering them with IANA is discouraged. However, an argument could be made that these schemes would not be used for general network/internet communication. A few popular examples of similarly non-networked schemes are the file: and about: URI schemes. (The exception here is that at least one physical layer (LAN) could be used for over-the-internet communication; but this has great benefits in its own right.)

===Example URI Schemes===

The following table represents some examples of how URI schemes could be used as debugger protocols:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | <nowiki>srdp://192.168.1.20/</nowiki>
| style="border: 1px solid #000000;" | TCP/IP to remote host 192.168.1.20 on a pre-defined default port
|-
| style="border: 1px solid #000000;" | <nowiki>srdp+udp://192.168.1.20:9424/</nowiki>
| style="border: 1px solid #000000;" | UDP/IP to remote host 192.168.1.20 on port 9424
|-
| style="border: 1px solid #000000;" | <nowiki>srdp+usb://localhost/</nowiki>
| style="border: 1px solid #000000;" | USB (SRDP-compatible devices) on localhost
|-
| style="border: 1px solid #000000;" | <nowiki>srdp+uart://localhost:3/</nowiki>
| style="border: 1px solid #000000;" | UART COM port 3 on localhost
|-
| style="border: 1px solid #000000;" | <nowiki>srdp+dbus://localhost/</nowiki>
| style="border: 1px solid #000000;" | DBUS IPC on localhost
|}

The 'srdp' prefix on these examples specifies the 'Scalable Remote Debugger Protocol.' The + and the suffix after it define an additional layer (or protocol) below SRDP.

The latter three examples look a bit odd with localhost being the destination, but this is necessary, since the localhost ''is'' the destination for hosting the UART RS-232 port, USB port, and IPC interface. Using non-loopback interfaces (IP addresses outside of the local machine) with these protocols should be undefined, unless there is evidence that connecting to RS-232/USB/IPC interfaces on other machines across a network is practical and plausible.
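To show how little work a client library would need to do to pick a transport from such a URI, here is a minimal Python sketch using the standard urllib.parse module. The function name and the returned dictionary layout are hypothetical; the default of TCP for a bare srdp: scheme follows the first row of the table above, and the default port is left as None because none has been defined yet.

```python
from urllib.parse import urlsplit


def parse_srdp_uri(uri):
    """Split an SRDP URI into transport, host, and port."""
    parts = urlsplit(uri)
    base, _, transport = parts.scheme.partition("+")
    if base != "srdp":
        raise ValueError("not an SRDP URI: %r" % uri)
    return {
        "transport": transport or "tcp",  # bare srdp:// means TCP/IP
        "host": parts.hostname,
        "port": parts.port,               # None when no port is given
    }


udp_target = parse_srdp_uri("srdp+udp://192.168.1.20:9424/")
uart_target = parse_srdp_uri("srdp+uart://localhost:3/")
```

For the UART example the "port" field carries the COM port number rather than a network port, as in the table.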

===Simplified Configuration===

These URI schemes give a very simple and elegant solution to the concerns they address. No longer will you be stuck with complicated configuration settings like the example below (upper left group box), which is not even an especially complex configuration dialog. Instead, connecting to ANY low-level debugger in the world will be as simple as typing a URL.

Example of what '''''not''''' to do:

[[Image:Gscc_config.png]]

===Operation Groups===

The protocol is defined as a set of usable "requests" (AKA "operations" or "commands") requested by the client to the debugger, or vice-versa. Operations should be grouped according to a specific metric. The metric I've chosen is hardware (architecture) relationships. The table below shows an example of such groups (currently 6 in total) and example operations assigned to each group.

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | 1)
| style="border: 1px solid #000000;" | Diagnostics (Info, Ping/Pong, Reset, ...)
|-
| style="border: 1px solid #000000;" | 2)
| style="border: 1px solid #000000;" | CPU handling (Register read/write, Arbitrary code execution, General CPU control, General process/thread control...)
|-
| style="border: 1px solid #000000;" | 3)
| style="border: 1px solid #000000;" | Memory handling (Read, Write, Address conversion, Hardware I/O, Cache control, ...)
|-
| style="border: 1px solid #000000;" | 4)
| style="border: 1px solid #000000;" | Breakpoint handling (Add, Delete, Edit, Get, ...)
|-
| style="border: 1px solid #000000;" | 5)
| style="border: 1px solid #000000;" | Stream handling (stdin/stdout/stderr, Debugger-specific messages, ...)
|-
| style="border: 1px solid #000000;" | 6)
| style="border: 1px solid #000000;" | Vendor-specific (Custom command sets; should be discouraged unless absolutely necessary)
|}

==Proposal==

This section defines a proposed specification which may be adopted as the "Scalable Remote Debugger Protocol". It is considered a work in progress and is currently open for peer-review, meaning we are interested in receiving comments, criticisms, and suggestions.

===Protocol Goals===

Goals of the protocol include:

# '''Client/server relationship''': Target (debuggee) acts as a server, quietly listening for any SRDP requests; User Interface acts as a client, making explicit requests to a listening server.
# '''Asynchronous requests''': A client must send requests without expecting an immediate response. A server accepting requests may not respond immediately to those requests.
# '''Scalable''': The data structure (format) used in the protocol must be adaptable to the future; the structure must be as forgiving and dynamic as possible, avoiding fixed contents (except where absolutely necessary) and allowing for [non-mission-critical] non-standard contents.
# '''Easy to implement''': Basic features of the protocol should be easy to implement from an API point-of-view, as well as having a small memory footprint; the protocol must be usable on small embedded machines with few resources.
# '''Robust''': Ambiguity should be kept to a minimum in all aspects of the protocol; every bit transferred should have a useful meaning.
# '''Easy to debug''': A debugger protocol that cannot itself be debugged (observed and verified to work as expected) is a failure in and of itself. For this reason, the protocol should be human-readable in its most basic form.

===Underlying Protocols===

There are no reservations on any underlying protocols (protocols used to move data from the client to the server, and back again -- SRDP is not one of these protocols). The only requirement is that they provide hand-shaking (transmission control), sequential ordering of packet data arrival, and data integrity checking. Some examples of suitable underlying protocols include [https://en.wikipedia.org/wiki/Internet_Protocol_Suite TCP/UDP/IP], and [https://en.wikipedia.org/wiki/Universal_asynchronous_receiver/transmitter UART].

The initial reference implementation will use TCP/IP for remote connections. For local-listening servers, the reference implementation will use UNIX Domain Sockets on UNIX-like operating systems, and Named Pipes on Windows.

===Requests, Responses, Alerts===

Packets are given different names depending on their transmission direction (client -> server, or server -> client) and intended recipient (server, specific client, or all clients).

Responses, requests, and alerts must have a unique identifier associated with them. This will allow clients and servers to stay in sync, knowing which responses are for which requests, for example.

====Requests====

A packet is called a request if it is from a client to the server. The name "request" comes from the idea that the client is requesting information or a specific action to be performed by the server.

====Responses====

Responses are packets from a server to a specific client. Responses are always sent in response to a request (hence the name). However, not all requests are required to send responses (which is why "requests" are not called "commands"). Responses are only sent to the client which initiated the specific request being handled.

====Alerts====

An alert is a special type of response (a packet from the server to clients); an alert is sent to all connected/listening clients. This is synonymous with network "broadcast" packets, and is useful for notifying all clients of information they might like to know.

A few examples of information that all clients might like to know are:

* Breakpoint hits
* Pausing/resuming execution
* Resets
* Debugging messages (log messages)

Not all alerts are initiated by requests from clients, but most will be. Log messages are typically spewed by programs without explicit requests; SRDP can allow listening for and capturing these messages.

====Commands====

The all-encompassing term for requests, responses and alerts is "commands". Any time a "command" is mentioned, it refers to any combination of requests, responses, or alerts.

===Protocol Packet Data Structure===

The "goals" section outlines the major features which formed the following data structure. Inspiration comes mainly from [https://en.wikipedia.org/wiki/JSON JSON], the JavaScript Object Notation. As JSON is a serialization [text format] of JavaScript objects, the SRDP data structure is a serialization of the data being transmitted.

The structure also shares some inspiration from [https://en.wikipedia.org/wiki/Remote_procedure_call RPC]. For example, your client may want to read work RAM from the target. The SRDP request for "read memory" is technically similar to remotely running a "read memory" function on the server, invoked by the client. For this reason, each SRDP packet contains zero or more "arguments" which you could imagine are passed directly to a plain old function.

Each packet is sent as a series of 8-bit bytes. Packets are broken down into a "request/response/alert" name (called a command), encapsulating a series of "arguments". You can think of it like a C function call. The "info" command, for example, requests information about the target machine architecture; it requires no arguments. The info command looks like this, and is a complete and valid SRDP packet:

info()

Each argument has a name made of one or more characters: alpha-numeric, underscore (_), or hyphen (-). The argument name is followed by a colon (:) and then a single byte representing the data type of the argument, then an equals sign (=) and the argument's value. All arguments are separated with a comma (,).

The argument syntax is similar to that of [https://en.wikipedia.org/wiki/Cascading_Style_Sheets CSS]. In pseudo-form, it looks something like this:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | {name}:{type}={value}
|}

Valid data types:

{| style="border-collapse: collapse; width: 100%;"
|-
! style="border: 1px solid #000000; background-color: #303030;" | {type}
! style="border: 1px solid #000000; background-color: #303030;" | Name
! style="border: 1px solid #000000; background-color: #303030;" | Description
|-
| style="border: 1px solid #000000;" | n
| style="border: 1px solid #000000;" | Number
| style="border: 1px solid #000000;" | Any positive integer, encoded as a [http://www.dlugosz.com/ZIP2/VLI.html VLI]
|-
| style="border: 1px solid #000000;" | s
| style="border: 1px solid #000000;" | Signed Number
| style="border: 1px solid #000000;" | Any negative integer, encoded as a [https://en.wikipedia.org/wiki/Signed_number_representations#Ones.27_complement one's complement] VLI
|-
| style="border: 1px solid #000000;" | f
| style="border: 1px solid #000000;" | Floating Point Number
| style="border: 1px solid #000000;" | Any non-integer number, Infinity, or [https://en.wikipedia.org/wiki/NaN NaN], encoded as a null-terminated UTF-8 string, or null-terminated UTF-16 or UTF-32 string with BOM; To be decoded by [https://en.wikipedia.org/wiki/Scanf#Format_string_specifications sscanf]
|-
| style="border: 1px solid #000000;" | a
| style="border: 1px solid #000000;" | Array
| style="border: 1px solid #000000;" | Byte-array (binary blob), preceded by a VLI to indicate the length of the array.
|-
| style="border: 1px solid #000000;" | c
| style="border: 1px solid #000000;" | Compressed Array
| style="border: 1px solid #000000;" | Byte-array (binary blob) with [https://en.wikipedia.org/wiki/Run-length_encoding RLE] compression. See [[#Compressed Array Data Type]]
|-
| style="border: 1px solid #000000;" | t
| style="border: 1px solid #000000;" | Text
| style="border: 1px solid #000000;" | Null-terminated UTF-8 string without [https://en.wikipedia.org/wiki/Byte-order_mark BOM], or null-terminated UTF-16 or UTF-32 string with BOM
|}

Some example arguments follow. (Please keep in mind that all of the argument names listed within this section are for demonstration purposes only, and are not recommended for reference purposes.)

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | msg:t=Hello, World!␀
|-
| style="border: 1px solid #000000;" | num:n=□
|-
| style="border: 1px solid #000000;" | pi:f=3.141592␀
|-
| style="border: 1px solid #000000;" | ram_dump:a=□■■■■
|-
| style="border: 1px solid #000000;" | my-compressed-data:c=□□■■■■□■□■■■
|}

What the symbols mean:

* ␀: Null-terminator
* □: VLI
* ■: Data byte
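The argument syntax described in this section can be sketched in Python as follows. Only the text type is shown, since it needs no VLI encoding. The function names are hypothetical, and placing the comma-separated arguments between the parentheses of the command is my reading of the info() example rather than something the proposal spells out.

```python
import re

# Argument names: one or more alpha-numeric, underscore, or hyphen characters.
NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def encode_text_argument(name, value):
    """Serialize {name}:t={value} as a null-terminated UTF-8 string."""
    if not NAME_RE.fullmatch(name):
        raise ValueError("invalid argument name: %r" % name)
    return name.encode("ascii") + b":t=" + value.encode("utf-8") + b"\x00"


def encode_packet(command, arguments=()):
    """Wrap already-encoded arguments in a command, like a C function call."""
    return command.encode("ascii") + b"(" + b",".join(arguments) + b")"


greeting = encode_text_argument("msg", "Hello, World!")
info_packet = encode_packet("info")
```

Note that the text value itself may contain a comma, as "Hello, World!" does, so a parser cannot simply split a packet on commas; it has to read each value according to its type (here, up to the null terminator).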

====Compressed Array Data Type====

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000; padding: 10px;" | '''''One suggestion is making this data-type optional, and changing this spec to use a standard library, like zlib. In this way, slow machines with few resources can adhere to the SRDP spec without wasting precious footprint space and computational power implementing such "heavy" libraries.

Using a "heavy-weight" compression library will help achieve better compression ratios, but will add processing overhead. The added overhead may turn out to be an overall hindrance. For example, a memory editor might want to refresh a memory display as fast as possible (several times per second) within a small visible view-port window. This kind of editor will need to download the contents of memory within the visible range quickly. Using compression in a use-case like this is a good idea, but only if transferring the smaller packet can make up for the time required to compress and decompress the data.'''''
|}

Compressed arrays are similar to the standard "Array" data type. The compressed array starts with a single VLI to represent the total data size of the value (i.e. the size of the compressed data, ''x''). The following data is a series of alternating raw data and RLE data.

# '''Raw Data''': A VLI representing raw data size in bytes (''n''), followed by ''n'' bytes of actual data.
# '''RLE Data''': A VLI representing RLE data size in bytes (''n''), followed by a single byte to be repeated (''n'' + 4) times in the output.

This series is repeated until there are no more bytes to be read from input (''x'' bytes of the argument value have been read).

For the RLE compression to be useful (efficient) it must not be used for any less than 4 bytes (therefore, the VLI is said to be a "4-based" number). The number 4 is derived from the minimum overhead introduced by the serial alternation and VLIs; 1 VLI for RLE output length, 1 byte of RLE data, 1 VLI for raw data length.

Thus, in order for the RLE to perform "compression", the RLE output must be larger than the smallest sequence required to switch from raw data to RLE data and back again. In the examples below, '''''bold-italic''''' bytes are VLI "control sequences" (the data lengths specified above):

'''EXAMPLE 1'''

Non-compressed data:
94 24 51 73 00 00 00 01

Incorrectly compressed data:
'''''04''''' 94 24 51 73 '''''03''''' 00 '''''01''''' 01

Correctly compressed data:
'''''08''''' 94 24 51 73 00 00 00 01

'''EXAMPLE 2'''

Non-compressed data:
94 24 51 73 00 00 00 00

Correctly compressed data:
'''''04''''' 94 24 51 73 '''''00''''' 00

'''EXAMPLE 3'''

Non-compressed data:
00 00 00 00 00 00 00 00

Correctly compressed data:
'''''00''''' '''''04''''' 00

The second line in example 1 above is "incorrectly" compressed because it treats the second VLI as a 0-based length. If this were the case, you would simply be adding overhead that cancels out any bytes saved by the "compression". For this reason, the "correct" way to compress the example is to use a single length of raw data. This example is non-compressible, and should not be sent as a compressed array data type.

In the second example, the data can be compressed nicely, saving a byte overall (including compression overhead). Since this is the "correct" way to compress the data, it is using a 4-based VLI on the RLE Data: "'''''00'''''" means 4 bytes of output, "'''''01'''''" means 5 bytes, "'''''02'''''" means 6 bytes, etc.

The third example shows how to compress a series of bytes that starts with repeating data, instead of non-repeating "raw data". The first VLI of '''''00''''' means there is no raw data for output (this is required: compressed arrays always begin with the Raw Data, followed by RLE Data). The second VLI '''''04''''' is the length of the RLE Data; 8 bytes. Even if the non-compressed data was 4 bytes of "00", the compressed array would still be only 3 bytes of total data, saving one byte. This helps explain the reasoning behind the 4-based VLI for RLE Data.
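The three examples can be checked with a short decoder. This Python sketch makes two simplifying assumptions: every VLI fits in a single byte (as it does in the examples), and the input is only the alternating Raw Data / RLE Data series, without the leading total-size VLI, which the examples above also appear to omit. The function name is hypothetical.

```python
def decode_compressed_series(body: bytes) -> bytes:
    """Expand an alternating Raw Data / RLE Data series.

    Assumes every VLI is a single byte, as in the examples above.
    """
    out = bytearray()
    pos = 0
    while pos < len(body):
        # Raw Data: a length n, followed by n literal bytes.
        raw_len = body[pos]
        pos += 1
        out += body[pos:pos + raw_len]
        pos += raw_len
        if pos >= len(body):
            break
        # RLE Data: a 4-based length n, then one byte repeated n + 4 times.
        run_len = body[pos] + 4
        out += bytes([body[pos + 1]]) * run_len
        pos += 2
    return bytes(out)


example_1 = decode_compressed_series(bytes.fromhex("089424517300000001"))
example_2 = decode_compressed_series(bytes.fromhex("04942451730000"))
example_3 = decode_compressed_series(bytes.fromhex("000400"))
```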

====Object Data Type====

The basic structure and data types shown so far are very powerful; your server can tell the client vast amounts of information, such as CPU architecture, memory maps, I/O maps, etc. with just a handful of arguments. However, grouping these arguments in a meaningful way may be difficult. You might be inclined to do some "mock" namespacing, like prefixing each CPU-related argument with "cpu_". This is effective, but also slightly wasteful and error-prone.

The "object" data type is designed to handle such situations. This data type allows an argument to be a container for other arguments. Its format looks like this:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | {name}:[...]
|}

The ellipsis denotes one or more "regular" arguments. Here is an example of what a "cpu" object might look like:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | <tt>cpu:[
:arch:t=ARM␀,
:name:t=ARM946E-S␀
]</tt>
|}
(Note the white-space is for readability only; it is not meant to be transferred as part of the protocol data.)

In this example, the "cpu" object contains two arguments: cpu.arch and cpu.name; both strings.

But there are also times when you will want your server to send that same information for two [or more] CPU architectures on a single target. Some platforms may have multiple CPUs, each with its own individual set of resources (memory maps and the like), as well as shared resources between the CPUs. For this, the packet data structure needs a more advanced method of communicating these kinds of details.

For this case, you can optionally create arrays of objects by including a comma (,) after the closing square bracket, followed by another series of arguments enclosed in their own square brackets, ''ad infinitum''. In other words, any object definitions without explicit names that follow a named object will be treated as additional array elements of that object.

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | <tt>cpu:[
:arch:t=ARM␀,
:name:t=ARM946E-S␀
], 
[
:arch:t=ARM␀,
:name:t=ARM7TDMI␀
]</tt>
|}

Now our "cpu" object defines two CPUs: an ARM9 and ARM7, ready for Nintendo DS hacking. These arguments can be referenced as cpu[0].arch, cpu[0].name, cpu[1].arch, and cpu[1].name respectively.

Objects can be arbitrarily complex, containing arrays of other objects. Here is an example of a simple memory map, containing Work RAM (read/write/execute) and ROM (read/execute) sections. The Work RAM section is broken into two distinct memory ranges. This is quite easily expressed:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | <tt>memory:[
:name:t=Work RAM␀,
:range:[
::start:n=□,
::end:n=□
:], 
:[
::start:n=□,
::end:n=□
:],
:flags:t=RWX␀
], 
[
:name:t=ROM␀,
:range:[
::start:n=□,
::end:n=□
:],
:flags:t=RX␀
]</tt>
|}

===Command Reference===

'''''FIXME: WIP'''''

====Requests====

=====info()=====
Request information about the target machine architecture. The '''info()''' request is implied during connection: A response will be sent after a connection is successfully established. This allows clients to gather necessary information about the target during the connection phase, without explicitly making the request.

''Return values'':

The response must contain exactly one of the following:
* '''sys''' (string): A shorthand for defining known machines (mostly video game systems). See below for supported values.
* '''machine''' (object): A longhand for defining the complete machine architecture. TODO: Specify all object elements and their meanings.

The currently supported values for '''sys''' are:
* '''NES''': Nintendo Entertainment System
* '''SNES''': Super Nintendo Entertainment System
* '''N64''': Nintendo 64
* '''NGC''': Nintendo GameCube

* '''PS1''': Sony Playstation
* '''PS2''': Sony Playstation 2

Planned future support:
* '''GB''': Nintendo GameBoy
* '''GBC''': Nintendo GameBoy Color
* '''GBA''': Nintendo GameBoy Advance
* '''NDS''': Nintendo DS/DSi
* '''VB''': Nintendo Virtual Boy
* '''WII''': Nintendo Wii

* '''SMS''': Sega Master System
* '''SMD''': Sega Genesis/Mega Drive
* '''SCD''': Sega CD/Mega CD
* '''32X''': Sega 32X
* '''GG''': Sega GameGear
* '''SAT''': Sega Saturn
* '''DC''': Sega Dreamcast

* '''PSP''': Sony Playstation Portable

* '''TGC''': Tiger Game.com

Considering support "some day":
* '''3DO''': 3DO
* '''A26''': Atari 2600
* '''LYNX''': Atari Lynx/Lynx II
* '''JAG''': Atari Jaguar
* '''WS''': Bandai Wonder Swan
* '''WSC''': Bandai Wonder Swan Color
* '''GP32''': GamePark 32
* '''GP2X''': GamePark Holdings GP2X
* '''WIZ''': GamePark Holdings GP2X Wiz
* '''XBOX''': Microsoft XBox
* '''X360''': Microsoft XBox 360
* '''TG16''': NEC TurboGrafx-16/PC Engine
* '''TGCD''': NEC TurboGrafx-CD/PC Engine CD
* '''SGFX''': NEC SuperGrafx
* '''NGEO''': Neo*Geo AES/MVS
* '''NGP''': Neo*Geo Pocket
* '''NGPC''': Neo*Geo Pocket Color
* '''PKMN''': Nintendo Pokémon Mini
* '''SGB''': Nintendo Super GameBoy
* '''CDI''': Philips CD-i
* '''VMU''': Sega Dreamcast VMU
* '''PS3''': Sony Playstation 3
* '''SV''': SuperVision

===Proposal Comments===

The "Protocol Goals" have led much of the motivation for developing this proposal. This presents what seems to be several very strange choices, at first glance. The choice of representing floating point numbers as a string of text seems extremely odd, until you consider that using the target-native floating point data format would make no sense when sending that data to a remote client (which is most likely running on a different architecture, and may have totally different native floating point formats). In order to express floating points with arbitrary precision in a data format-agnostic way, it is necessary to use a non-native format like text.

Another oddity in this spec is the use of VLIs (variable-length integers) and their effect on the rest of the format. The main purpose for using VLIs is address widths. Some architectures can express their full address range within a single byte. Others require up to 8 bytes for a full 64-bit address range. Future architectures are by no means limited to 64-bit address widths. For this very reason, it is necessary to scale down as well as up. A VLI can express an address in as little as a single byte, or scale upward to arbitrarily large numbers. This makes VLIs perfect for addressing requirements among any architecture.

VLIs present their own issues, however. For example, expressing a negative number as a VLI is nearly incomprehensible. Some might be inclined to reserve one bit within a VLI to indicate signedness, but that's another bit that cannot be used to minimize VLI overhead. The overhead is additional bytes required to represent a full number in a VLI system. For example, it is common for numbers 0 - 127 to be contained entirely within a single byte, including the overhead. But numbers between 128 - 255 require an additional byte to include more VLI "header" information (used to extend VLIs into arbitrarily long numbers). This is counter-intuitive, where a single byte itself can hold numbers between 0 - 255. Adding an additional sign bit reduces the range of VLIs by half: a single byte can only encode numbers between 0 - 63.

The solution is to use a different data type specifically for expressing negative numbers. The VLI is encoded just like a positive number, but when interpreting the VLI, it must be converted to a negative number by either subtracting from zero (0 - ''n'') or multiplying by negative one (''n'' * -1). This proposal refers to this as a "one's complement", although strictly speaking it is a sign-and-magnitude scheme in which the data type carries the sign.

In general, the efficiency of a VLI is very static. That means, a number using 0 - 7 bits of data (for example, the number "0" uses 0 bits of data, and the number "64" [binary: 1000000] uses 7 bits) can be encoded into a single byte, a number using 8 - 14 bits can be encoded into 2 bytes, a number using 15 - 21 bits can be encoded into 3 bytes, etc. See http://www.dlugosz.com/ZIP2/VLI.html for more information on the kind of VLI I am considering for this proposal.
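To make the byte counts above concrete, here is a minimal Python sketch of a length-prefixed VLI. The bit layout is an assumption made for illustration (a count of leading 1 bits in the first byte selects the total length, and the remaining bits are stored big-endian); the proposal itself does not fix a layout, and the sketch stops at 3 bytes (21 bits). The names <tt>vli_encode</tt> and <tt>vli_decode</tt> are hypothetical.

```python
def vli_encode(value: int) -> bytes:
    """Encode a non-negative integer in 1, 2, or 3 bytes (assumed layout)."""
    if value < 0:
        raise ValueError("negative numbers use a separate data type")
    if value < 1 << 7:    # 0xxxxxxx: 7 data bits
        return bytes([value])
    if value < 1 << 14:   # 10xxxxxx + 1 byte: 14 data bits
        return bytes([0x80 | (value >> 8), value & 0xFF])
    if value < 1 << 21:   # 110xxxxx + 2 bytes: 21 data bits
        return bytes([0xC0 | (value >> 16), (value >> 8) & 0xFF, value & 0xFF])
    raise ValueError("longer forms are not covered by this sketch")


def vli_decode(buf: bytes) -> tuple[int, int]:
    """Return (value, bytes consumed) for a VLI at the start of buf."""
    first = buf[0]
    if first & 0x80 == 0x00:
        return first, 1
    if first & 0xC0 == 0x80:
        return ((first & 0x3F) << 8) | buf[1], 2
    if first & 0xE0 == 0xC0:
        return ((first & 0x1F) << 16) | (buf[1] << 8) | buf[2], 3
    raise ValueError("longer forms are not covered by this sketch")


for n in (0, 127, 128, 16383, 16384):
    print(n, vli_encode(n).hex(" "))
```

With this layout, 127 still fits in one byte while 128 needs two, which is the overhead discussed above.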

[[Category:Developer_Documentation]]

Scalable Remote Debugger Protocol

2020-12-28T05:19:54Z

Parasyte: Fix external links

This page is currently serving as a reference to kick-start development of the universal debugger protocol which will be used by the [[Universal Debugger Project]] and hopefully many, many other debuggers and debugger interfaces in the years to come.

==References==

These references are listed in order of relevance; most relevant first.

# [https://tools.ietf.org/html/rfc909 RFC-909: Loader Debugger Protocol]
# [https://sourceware.org/gdb/onlinedocs/gdb/Remote-Protocol.html GDB Remote Serial Protocol]
# [https://tools.ietf.org/html/rfc643 RFC-643: Network Debugging Protocol]
# [http://networksorcery.com/enp/ien/ien158.txt IEN-158: XNET Debugging Protocol]
# [https://xdebug.org/docs/dbgp DBGp: A common debugger protocol for languages and debugger UI communication]

The relevancy I've determined for this list is due to interest in these specs, as well as potential generic uses and protocol extension.

RFC-909 is so far the closest thing I have found which resembles the general idea I have for a "Universal Debugger Protocol". It's composed as a simple binary packet, it's extensible, and it's designed to be stacked on top of existing transport protocols such as TCP/IP. I doubt this exact spec will fit all of our needs, but it is certainly a good start.

GDB provides a fairly popular protocol. This one is designed for serial communications, so it will work well with small embedded devices. But it could be complicated to extend while retaining its GDB friendliness.

RFC-643 and IEN-158 are interesting only because they show that some experimentation with the ideas of remote debugging has been done in the past. Unfortunately, these specs were designed for a specific architecture, and are of little practical use for our purposes.

DBGp shows what a modern remote debugging protocol can look like; including modern XML syntax. The downside to this is that low-level debuggers in small embedded devices are unlikely to parse XML at all.

==Ideas==

This section represents my ([[User:Parasyte|Parasyte]]) own personal opinions and ideas, and should not be taken as advocacy for standardization.

One of the main goals of developing a "universal" protocol for debugging is that it must be usable everywhere; in small embedded devices, and some of the most powerful machines in the world. This kind of flexibility must be designed around multiple layers of abstraction. See [http://en.wikipedia.org/wiki/OSI_Model OSI Model] and [http://en.wikipedia.org/wiki/Internet_Protocol_Suite Internet Protocol Suite] for examples of abstraction layers used in communications technologies.

At the lowest layer, you find the ''wire''; the physical means of transmitting information over distance. For our purposes, we should not limit ourselves to a single wire. Instead, we should allow the use of multiple wires, user-selectable, but never more than one at a time.

The following layers get more and more generic and abstract, until you reach the highest layer which represents what the application sees and interacts with. This would be the "protocol" itself.

So let's break these components down, hypothetically, and get into some details, ordered lowest layer first:

# '''Physical layer''': Some examples of wires to support include LAN (Ethernet/WiFi), Wireless (Bluetooth), RS-232 (serial port, USB serial port), Inter-Process Communication (Domain Sockets? DBUS?)
# '''Transport layer''': Some examples of transport protocols include TCP/IP, UDP/IP (LAN, Domain Sockets), UART (RS-232), IPC-specific (DBUS)
# '''Application layer''': A library (or similar service, e.g. a daemon) to tie all transport layers into a single API that, to the application, looks like one simple interface to connect and send/receive data. The library/daemon will have to handle the transport-specific details behind the scenes.

Thinking about this led to a conundrum: if we support multiple wires, we have to support multiple transport protocols which are compatible with those wires. And if we support multiple transport protocols, we have to know which one our target implements. To make the API as simple as possible, we must not force clients to choose from configurable options that require a large number of changes for each different type of connection made. How do we simplify the API so that a user can just plain connect without doing any pre-setup work?

Answer: The [http://en.wikipedia.org/wiki/URI_scheme URI scheme]. The unfortunate downside to this solution is that using URI schemes without registering them with IANA is discouraged. However, an argument could be made that these schemes would not be used for general network/internet communication. A few popular examples of similarly non-networked schemes are the file: and about: URI schemes. (The exception here is that at least one physical layer (LAN) could be used for over-the-internet communication; but this has great benefits in its own right.)

===Example URI Schemes===

The following table represents some examples of how URI schemes could be used as debugger protocols:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | <nowiki>srdp://192.168.1.20/</nowiki>
| style="border: 1px solid #000000;" | TCP/IP to remote host 192.168.1.20 on a pre-defined default port
|-
| style="border: 1px solid #000000;" | <nowiki>srdp+udp://192.168.1.20:9424/</nowiki>
| style="border: 1px solid #000000;" | UDP/IP to remote host 192.168.1.20 on port 9424
|-
| style="border: 1px solid #000000;" | <nowiki>srdp+usb://localhost/</nowiki>
| style="border: 1px solid #000000;" | USB (SRDP-compatible devices) on localhost
|-
| style="border: 1px solid #000000;" | <nowiki>srdp+uart://localhost:3/</nowiki>
| style="border: 1px solid #000000;" | UART COM port 3 on localhost
|-
| style="border: 1px solid #000000;" | <nowiki>srdp+dbus://localhost/</nowiki>
| style="border: 1px solid #000000;" | DBUS IPC on localhost
|}

The 'srdp' prefix in these examples specifies the 'Scalable Remote Debugger Protocol.' The + and the suffix that follows it define an additional layer (or protocol) below SRDP.

The latter three examples look a bit odd with localhost being the destination, but this is necessary, since the localhost ''is'' the destination for hosting the UART RS-232 port, USB port, and IPC interface. Using non-loopback interfaces (IP addresses outside of the local machine) with these protocols should be undefined, unless there is evidence that connecting to RS-232/USB/IPC interfaces on other machines across a network is practical and plausible.
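As a rough illustration of how a client library might take these URIs apart, the following Python sketch uses the standard <tt>urllib.parse</tt> module. The function name <tt>parse_srdp_uri</tt> and the returned dictionary keys are hypothetical, and because no default port has been defined yet, a missing port is simply reported as <tt>None</tt>.

```python
from urllib.parse import urlsplit


def parse_srdp_uri(uri: str) -> dict:
    """Split an SRDP URI into the lower-layer transport, host, and port."""
    parts = urlsplit(uri)
    # "srdp+udp" -> base "srdp", lower layer "udp"; plain "srdp" means TCP/IP.
    base, _, lower = parts.scheme.partition("+")
    if base != "srdp":
        raise ValueError(f"not an SRDP URI: {uri!r}")
    return {
        "transport": lower or "tcp",
        "host": parts.hostname,
        "port": parts.port,  # None when the URI does not name a port
    }


for uri in ("srdp://192.168.1.20/",
            "srdp+udp://192.168.1.20:9424/",
            "srdp+uart://localhost:3/"):
    print(uri, "->", parse_srdp_uri(uri))
```

For the UART case the "port" field carries the COM port number, which matches the way the table above reuses the port position of the URI.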

===Simplified Configuration===

These URI schemes give a very simple and elegant solution to the concerns they address. No longer will you be stuck with complicated configuration settings like the example below (upper left group box), which is not even an especially complex configuration dialog. Instead, connecting to ANY low-level debugger in the world will be as simple as typing a URL.

Example of what '''''not''''' to do:

[[Image:Gscc_config.png]]

===Operation Groups===

The protocol is defined as a set of usable "requests" (AKA "operations" or "commands") requested by the client to the debugger, or vice-versa. Operations should be grouped according to a specific metric. The metric I've chosen is hardware (architecture) relationships. The table below shows an example of such groups (currently 6 in total) and example operations assigned to each group.

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | 1)
| style="border: 1px solid #000000;" | Diagnostics (Info, Ping/Pong, Reset, ...)
|-
| style="border: 1px solid #000000;" | 2)
| style="border: 1px solid #000000;" | CPU handling (Register read/write, Arbitrary code execution, General CPU control, General process/thread control...)
|-
| style="border: 1px solid #000000;" | 3)
| style="border: 1px solid #000000;" | Memory handling (Read, Write, Address conversion, Hardware I/O, Cache control, ...)
|-
| style="border: 1px solid #000000;" | 4)
| style="border: 1px solid #000000;" | Breakpoint handling (Add, Delete, Edit, Get, ...)
|-
| style="border: 1px solid #000000;" | 5)
| style="border: 1px solid #000000;" | Stream handling (stdin/stdout/stderr, Debugger-specific messages, ...)
|-
| style="border: 1px solid #000000;" | 6)
| style="border: 1px solid #000000;" | Vendor-specific (Custom command sets; should be discouraged unless absolutely necessary)
|}

==Proposal==

This section defines a proposed specification which may be adopted as the "Scalable Remote Debugger Protocol". It is considered a work in progress and is currently open for peer-review, meaning we are interested in receiving comments, criticisms, and suggestions.

===Protocol Goals===

Goals of the protocol include:

# '''Client/server relationship''': Target (debuggee) acts as a server, quietly listening for any SRDP requests; User Interface acts as a client, making explicit requests to a listening server.
# '''Asynchronous requests''': A client must send requests without expecting an immediate response. A server accepting requests may not respond immediately to those requests.
# '''Scalable''': The data structure (format) used in the protocol must be adaptable to the future; The structure must be as forgiving and dynamic as possible, avoiding fixed contents (except where absolutely necessary) and allowing for [non-mission-critical] non-standard contents.
# '''Easy to implement''': Basic features of the protocol should be easy to implement from an API point-of-view, as well as having a small memory footprint; the protocol must be usable on small embedded machines with few resources.
# '''Robust''': Ambiguity should be kept to a minimum in all aspects of the protocol; every bit transferred should have a useful meaning.
# '''Easy to debug''': A debugger protocol that cannot itself be debugged (observed and verified to work as expected) is a failure in and of itself. For this reason, the protocol should be human-readable in its most basic form.

===Underlying Protocols===

There are no reservations on any underlying protocols (protocols used to move data from the client to the server, and back again -- SRDP is not one of these protocols). The only requirement is that they provide hand-shaking (transmission control), sequential ordering of packet data arrival, and data integrity checking. Some examples of suitable underlying protocols include [http://en.wikipedia.org/wiki/Internet_Protocol_Suite TCP/UDP/IP], and [http://en.wikipedia.org/wiki/Universal_asynchronous_receiver/transmitter UART].

The initial reference implementation will use TCP/IP for remote connections. For local-listening servers, the reference implementation will use UNIX Domain Sockets on UNIX-like operating systems, and Named Pipes on Windows.

===Requests, Responses, Alerts===

Packets are given different names depending on their transmission direction (client -> server, or server -> client) and intended recipient (server, specific client, or all clients).

Responses, requests, and alerts must have a unique identifier associated with them. This will allow clients and servers to stay in sync, knowing which responses are for which requests, for example.
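One way a client could use such identifiers is to keep a table of outstanding requests, so that responses arriving late or out of order can still be matched up. The Python sketch below is only an illustration: the class name <tt>PendingRequests</tt> is hypothetical, the identifiers are assumed to be small counting integers, and how an identifier is carried inside a packet is not defined by this proposal.

```python
import itertools


class PendingRequests:
    """Track requests that have been sent but not yet answered."""

    def __init__(self):
        self._ids = itertools.count(1)  # assumed: simple increasing integers
        self._pending = {}

    def send(self, command: str) -> int:
        """Record an outgoing request and return its unique identifier."""
        request_id = next(self._ids)
        self._pending[request_id] = command
        return request_id

    def on_response(self, request_id: int) -> str:
        """Match a response to its request; unknown identifiers are an error."""
        if request_id not in self._pending:
            raise KeyError(f"no outstanding request with id {request_id}")
        return self._pending.pop(request_id)

    def outstanding(self) -> int:
        return len(self._pending)


table = PendingRequests()
first = table.send("info")
second = table.send("info")
print(table.on_response(second), table.outstanding())
```

Because nothing here waits for a reply, the table fits the asynchronous request goal: the client can keep sending while earlier requests are still unanswered.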

====Requests====

A packet is called a request if it is from a client to the server. The name "request" comes from the idea that the client is requesting information or a specific action to be performed by the server.

====Responses====

Responses are packets from a server to a specific client. Responses are always sent in response to a request (hence the name). However, not all requests are required to send responses (which is why "requests" are not called "commands"). Responses are only sent to the client which initiated the specific request being handled.

====Alerts====

An alert is a special type of response (a packet from the server to clients); an alert is sent to all connected/listening clients. This is analogous to network "broadcast" packets, and it is useful for notifying all clients of information they might like to know.

A few examples of information that all clients might like to know are:

* Breakpoint hits
* Pausing/resuming execution
* Resets
* Debugging messages (log messages)

Not all alerts are initiated by requests from clients, but most will be. Log messages are typically spewed by programs without explicit requests; SRDP can allow listening for and capturing these messages.

====Commands====

The all-encompassing term for requests, responses and alerts is "commands". Any time a "command" is mentioned, it refers to any combination of requests, responses, or alerts.

===Protocol Packet Data Structure===

The "goals" section outlines the major features which formed the following data structure. Inspiration comes mainly from [http://en.wikipedia.org/wiki/JSON JSON], the JavaScript Object Notation. As JSON is a serialization [text format] of JavaScript objects, the SRDP data structure is a serialization of the data being transmitted.

The structure also shares some inspiration from [http://en.wikipedia.org/wiki/Remote_procedure_call RPC]; An example is that your client may want to read work RAM from the target. The SRDP request for "read memory" is technically similar to remotely running a "read memory" function on the server, invoked by the client. For this reason, each SRDP packet contains one or more "arguments" which you could imagine are passed directly to a plain old function.

Each packet is sent as a series of 8-bit bytes. Packets are broken down into a "request/response/alert" name (called a command), encapsulating a series of "arguments". You can think of it like a C function call. The "info" command, for example, requests information about the target machine architecture; it requires no arguments. The info command looks like this, and is a complete and valid SRDP packet:

info()

Each argument has a name made of one or more characters: alpha-numeric, underscore (_), or hyphen (-). The argument name is followed by a colon (:) and then a single byte representing the data type of the argument, then an equals sign (=) and the argument's value. All arguments are separated with a comma (,).

The argument syntax is similar to that of [http://en.wikipedia.org/wiki/Cascading_Style_Sheets CSS]. In pseudo-form, it looks something like this:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | {name}:{type}={value}
|}

Valid data types:

{| style="border-collapse: collapse; width: 100%;"
|-
! style="border: 1px solid #000000; background-color: #303030;" | {type}
! style="border: 1px solid #000000; background-color: #303030;" | Name
! style="border: 1px solid #000000; background-color: #303030;" | Description
|-
| style="border: 1px solid #000000;" | n
| style="border: 1px solid #000000;" | Number
| style="border: 1px solid #000000;" | Any positive integer, encoded as a [http://www.dlugosz.com/ZIP2/VLI.html VLI]
|-
| style="border: 1px solid #000000;" | s
| style="border: 1px solid #000000;" | Signed Number
| style="border: 1px solid #000000;" | Any negative integer, encoded as a [http://en.wikipedia.org/wiki/Signed_number_representations#Ones.27_complement one's complement] VLI
|-
| style="border: 1px solid #000000;" | f
| style="border: 1px solid #000000;" | Floating Point Number
| style="border: 1px solid #000000;" | Any non-integer number, Infinity, or [http://en.wikipedia.org/wiki/NaN NaN], encoded as a null-terminated UTF-8 string, or null-terminated UTF-16 or UTF-32 string with BOM; To be decoded by [http://en.wikipedia.org/wiki/Scanf#Format_string_specifications sscanf]
|-
| style="border: 1px solid #000000;" | a
| style="border: 1px solid #000000;" | Array
| style="border: 1px solid #000000;" | Byte-array (binary blob), preceded by a VLI to indicate the length of the array.
|-
| style="border: 1px solid #000000;" | c
| style="border: 1px solid #000000;" | Compressed Array
| style="border: 1px solid #000000;" | Byte-array (binary blob) with [http://en.wikipedia.org/wiki/Run-length_encoding RLE] compression. See [[#Compressed Array Data Type]]
|-
| style="border: 1px solid #000000;" | t
| style="border: 1px solid #000000;" | Text
| style="border: 1px solid #000000;" | Null-terminated UTF-8 string without [http://en.wikipedia.org/wiki/Byte-order_mark BOM], or null-terminated UTF-16 or UTF-32 string with BOM
|}

Some example arguments. (Please keep in mind that all of the argument names listed within this section are for demonstration purposes only, and are not recommended for reference purposes.)

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | msg:t=Hello, World!␀
|-
| style="border: 1px solid #000000;" | num:n=□
|-
| style="border: 1px solid #000000;" | pi:f=3.141592␀
|-
| style="border: 1px solid #000000;" | ram_dump:a=□■■■■
|-
| style="border: 1px solid #000000;" | my-compressed-data:c=□□■■■■□■□■■■
|}

What the symbols mean:

␀: Null-terminator
□: VLI
■: Data byte
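The example arguments above can be produced with a few lines of Python. This is a sketch under stated assumptions rather than a reference implementation: the helper names are hypothetical, VLIs are limited to a single byte (values below 128), and placing the comma-separated arguments between the parentheses of the command is inferred from the "C function call" analogy.

```python
import re

# Argument names: one or more alpha-numeric, underscore, or hyphen characters.
NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def check_name(name: str) -> bytes:
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid argument name: {name!r}")
    return name.encode("ascii")


def text_arg(name: str, value: str) -> bytes:
    """{name}:t={value} as null-terminated UTF-8 without a BOM."""
    return check_name(name) + b":t=" + value.encode("utf-8") + b"\x00"


def number_arg(name: str, value: int) -> bytes:
    """{name}:n={value}; only single-byte VLIs are handled in this sketch."""
    if not 0 <= value < 128:
        raise ValueError("multi-byte VLIs are not covered by this sketch")
    return check_name(name) + b":n=" + bytes([value])


def array_arg(name: str, data: bytes) -> bytes:
    """{name}:a={length VLI}{data bytes}."""
    if len(data) >= 128:
        raise ValueError("multi-byte VLIs are not covered by this sketch")
    return check_name(name) + b":a=" + bytes([len(data)]) + data


def packet(command: str, *args: bytes) -> bytes:
    """Wrap serialized arguments in a command, e.g. info()."""
    return check_name(command) + b"(" + b",".join(args) + b")"


print(packet("info"))
print(text_arg("msg", "Hello, World!"))
```

Since array values are length-prefixed and text values are null-terminated, a parser reading these bytes has to consume each value by its type rather than scanning ahead for the next comma.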

====Compressed Array Data Type====

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000; padding: 10px;" | '''''One suggestion is making this data-type optional, and changing this spec to use a standard library, like zlib. In this way, slow machines with few resources can adhere to the SRDP spec without wasting precious footprint space and computational power implementing such "heavy" libraries.

Using a "heavy-weight" compression library will help achieve better compression ratios, but will add processing overhead. The added overhead may turn out to be an overall hindrance. For example, a memory editor might want to refresh a memory display as fast as possible (several times per second) within a small visible view-port window. This kind of editor will need to download the contents of memory within the visible range quickly. Using compression in a use-case like this is a good idea, but only if transferring the smaller packet can make up for the time required to compress and decompress the data.'''''
|}

Compressed arrays are similar to the standard "Array" data type. The compressed array starts with a single VLI to represent the total data size of the value (i.e. the size of the compressed data, ''x''). The following data is a series of alternating raw data and RLE data.

# '''Raw Data''': A VLI representing raw data size in bytes (''n''), followed by ''n'' bytes of actual data.
# '''RLE Data''': A VLI representing RLE data size in bytes (''n''), followed by a single byte to be repeated (''n'' + 4) times in the output.

This series is repeated until there are no more bytes to be read from input (''x'' bytes of the argument value have been read).

For the RLE compression to be useful (efficient) it must not be used for any less than 4 bytes (therefore, the VLI is said to be a "4-based" number). The number 4 is derived from the minimum overhead introduced by the serial alternation and VLIs; 1 VLI for RLE output length, 1 byte of RLE data, 1 VLI for raw data length.

Thus, in order for the RLE to perform "compression", the RLE output must be larger than the smallest sequence required to switch from raw data and back again. Some examples follow to illustrate; '''''bold-italic''''' bytes are VLI "control sequences" (the data lengths specified above):

'''EXAMPLE 1'''

Non-compressed data:
94 24 51 73 00 00 00 01

Incorrectly compressed data:
'''''04''''' 94 24 51 73 '''''03''''' 00 '''''01''''' 01

Correctly compressed data:
'''''08''''' 94 24 51 73 00 00 00 01

'''EXAMPLE 2'''

Non-compressed data:
94 24 51 73 00 00 00 00

Correctly compressed data:
'''''04''''' 94 24 51 73 '''''00''''' 00

'''EXAMPLE 3'''

Non-compressed data:
00 00 00 00 00 00 00 00

Correctly compressed data:
'''''00''''' '''''04''''' 00

The second line in example 1 above is "incorrectly" compressed because its second VLI assumes the RLE length is 0-based. If that were the case, the overhead of switching to RLE would simply cancel out any bytes saved by the "compression". For this reason, the "correct" way to compress the example is to use a single length of raw data. This example is non-compressible, and should not be sent as a compressed array data type.

In the second example, the data can be compressed nicely, saving a byte overall (including compression overhead). Since this is the "correct" way to compress the data, it is using a 4-based VLI on the RLE Data: "'''''00'''''" means 4 bytes of output, "'''''01'''''" means 5 bytes, "'''''02'''''" means 6 bytes, etc.

The third example shows how to compress a series of bytes that starts with repeating data, instead of non-repeating "raw data". The first VLI of '''''00''''' means there is no raw data for output (this is required: compressed arrays always begin with the Raw Data, followed by RLE Data). The second VLI '''''04''''' is the length of the RLE Data; 8 bytes. Even if the non-compressed data was 4 bytes of "00", the compressed array would still be only 3 bytes of total data, saving one byte. This helps explain the reasoning behind the 4-based VLI for RLE Data.
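Reading the format back is straightforward, and a decoder makes the difference between the "correct" and "incorrect" streams easy to see. The Python sketch below is an illustration only: the function name <tt>decompress_body</tt> is hypothetical, every VLI is assumed to fit in a single byte, and the leading total-size VLI is assumed to have been stripped already, as in the examples.

```python
def decompress_body(body: bytes) -> bytes:
    """Expand alternating Raw Data / RLE Data segments.

    Assumption: every VLI is a single byte (0..127).
    """
    out = bytearray()
    i = 0
    while i < len(body):
        # Raw Data: length n, then n literal bytes.
        n = body[i]
        i += 1
        if n >= 0x80:
            raise ValueError("multi-byte VLIs are not covered by this sketch")
        if i + n > len(body):
            raise ValueError("truncated raw data")
        out += body[i:i + n]
        i += n
        if i >= len(body):
            break
        # RLE Data: 4-based length n, then one byte repeated n + 4 times.
        if i + 1 >= len(body):
            raise ValueError("truncated RLE data")
        n = body[i]
        if n >= 0x80:
            raise ValueError("multi-byte VLIs are not covered by this sketch")
        out += bytes([body[i + 1]]) * (n + 4)
        i += 2
    return bytes(out)


print(decompress_body(bytes.fromhex("04942451730000")).hex(" "))
print(decompress_body(bytes.fromhex("000400")).hex(" "))
```

Feeding the "incorrectly compressed" stream from example 1 through the same function yields 12 bytes instead of the original 8, because the '''''03''''' is read as a run of 7.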

====Object Data Type====

The basic structure and data types shown so far are very powerful; your server can tell the client vast amounts of information, such as CPU architecture, memory maps, I/O maps, etc. with just a handful of arguments. However, grouping these arguments in a meaningful way may be difficult. You might be inclined to do some "mock" namespacing, like prefixing each CPU-related argument with "cpu_". This is effective, but also slightly wasteful and error-prone.

The "object" data type is designed to handle such situations. This data type allows an argument to be a container for other arguments. Its format looks like this:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | {name}:[...]
|}

The ellipsis denotes one or more "regular" arguments. Here is an example of what a "cpu" object might look like:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | <tt>cpu:[
:arch:t=ARM␀,
:name:t=ARM946E-S␀
]</tt>
|}
(Note the white-space is for readability only; it is not meant to be transferred as part of the protocol data.)

In this example, the "cpu" object contains two arguments: cpu.arch and cpu.name; both strings.

But there are also times when you will want your server to send that same information for two [or more] CPU architectures on a single target. Some platforms may have multiple CPUs, each with its own individual set of resources (memory maps and the like), as well as shared resources between the CPUs. For this, the packet data structure needs a more advanced method of communicating these kinds of details.

For this case, you can optionally create arrays of objects by including a comma (,) after the closing square bracket, followed by another series of arguments enclosed in their own square brackets, ''ad infinitum''. In other words, any object definitions without explicit names that follow a named object will be treated as additional array elements of that object.

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | <tt>cpu:[
:arch:t=ARM␀,
:name:t=ARM946E-S␀
], 
[
:arch:t=ARM␀,
:name:t=ARM7TDMI␀
]</tt>
|}

Now our "cpu" object defines two CPUs: an ARM9 and ARM7, ready for Nintendo DS hacking. These arguments can be referenced as cpu[0].arch, cpu[0].name, cpu[1].arch, and cpu[1].name respectively.
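The two-CPU example can be generated from an ordinary list of dictionaries, which also shows where the cpu[0] and cpu[1] indices come from. This Python sketch is illustrative: the helper names are hypothetical, only text ("t") members are supported, and the output contains no white-space, as required for data on the wire.

```python
def text_arg(name: str, value: str) -> bytes:
    """{name}:t={value} as null-terminated UTF-8."""
    return name.encode("ascii") + b":t=" + value.encode("utf-8") + b"\x00"


def object_arg(name: str, elements: list) -> bytes:
    """Serialize a list of dicts as {name}:[...],[...] (text members only)."""
    if not elements:
        raise ValueError("an object needs at least one element")
    groups = []
    for element in elements:
        # Each array element becomes one bracketed group of arguments.
        inner = b",".join(text_arg(key, value) for key, value in element.items())
        groups.append(b"[" + inner + b"]")
    # Only the first group is named; the rest extend the same array.
    return name.encode("ascii") + b":" + b",".join(groups)


cpus = [
    {"arch": "ARM", "name": "ARM946E-S"},  # cpu[0]
    {"arch": "ARM", "name": "ARM7TDMI"},   # cpu[1]
]
print(object_arg("cpu", cpus))
```

A single-element list produces the plain object form, so the same helper covers both the one-CPU and the two-CPU examples.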

Objects can be arbitrarily complex, containing arrays of other objects. Here is an example of a simple memory map, containing Work RAM (read/write/execute) and ROM (read/execute) sections. The Work RAM section is broken into two distinct memory ranges. This is quite easily expressed:

{| style="border-collapse: collapse; width: 100%;"
|-
| style="border: 1px solid #000000;" | <tt>memory:[
:name:t=Work RAM␀,
:range:[
::start:n=□,
::end:n=□
:], 
:[
::start:n=□,
::end:n=□
:],
:flags:t=RWX␀
], 
[
:name:t=ROM␀,
:range:[
::start:n=□,
::end:n=□
:],
:flags:t=RX␀
]</tt>
|}

===Command Reference===

'''''FIXME: WIP'''''

====Requests====

=====info()=====
Request information about the target machine architecture. The '''info()''' request is implied during connection: A response will be sent after a connection is successfully established. This allows clients to gather necessary information about the target during the connection phase, without explicitly making the request.

''Return values'':

The response must contain exactly one of the following:
* '''sys''' (string): A shorthand for defining known machines (mostly video game systems). See below for supported values.
* '''machine''' (object): A longhand for defining the complete machine architecture. TODO: Specify all object elements and their meanings.

The currently supported values for '''sys''' are:
* '''NES''': Nintendo Entertainment System
* '''SNES''': Super Nintendo Entertainment System
* '''N64''': Nintendo 64
* '''NGC''': Nintendo GameCube

* '''PS1''': Sony PlayStation
* '''PS2''': Sony PlayStation 2

Planned future support:
* '''GB''': Nintendo Game Boy
* '''GBC''': Nintendo Game Boy Color
* '''GBA''': Nintendo Game Boy Advance
* '''NDS''': Nintendo DS/DSi
* '''VB''': Nintendo Virtual Boy
* '''WII''': Nintendo Wii

* '''SMS''': Sega Master System
* '''SMD''': Sega Genesis/Mega Drive
* '''SCD''': Sega CD/Mega CD
* '''32X''': Sega 32X
* '''GG''': Sega Game Gear
* '''SAT''': Sega Saturn
* '''DC''': Sega Dreamcast

* '''PSP''': Sony PlayStation Portable

* '''TGC''': Tiger Game.com

Considering support "some day":
* '''3DO''': 3DO
* '''A26''': Atari 2600
* '''LYNX''': Atari Lynx/Lynx II
* '''JAG''': Atari Jaguar
* '''WS''': Bandai WonderSwan
* '''WSC''': Bandai WonderSwan Color
* '''GP32''': GamePark 32
* '''GP2X''': GamePark Holdings GP2X
* '''WIZ''': GamePark Holdings GP2X Wiz
* '''XBOX''': Microsoft Xbox
* '''X360''': Microsoft Xbox 360
* '''TG16''': NEC TurboGrafx-16/PC Engine
* '''TGCD''': NEC TurboGrafx-CD/PC Engine CD
* '''SGFX''': NEC SuperGrafx
* '''NGEO''': Neo*Geo AES/MVS
* '''NGP''': Neo*Geo Pocket
* '''NGPC''': Neo*Geo Pocket Color
* '''PKMN''': Nintendo Pokémon Mini
* '''SGB''': Nintendo Super Game Boy
* '''CDI''': Philips CD-i
* '''VMU''': Sega Dreamcast VMU
* '''PS3''': Sony PlayStation 3
* '''SV''': SuperVision

===Proposal Comments===

The "Protocol Goals" have driven most of the decisions in this proposal. As a result, several of its choices seem very strange at first glance. Representing floating-point numbers as a string of text seems extremely odd, until you consider that using the target-native floating-point format would make no sense when sending that data to a remote client (which is most likely running on a different architecture, and may have a totally different native floating-point format). In order to express floating-point numbers with arbitrary precision in a format-agnostic way, it is necessary to use a non-native format like text.

Another oddity in this spec is the use of VLIs (variable-length integers) and their effect on the rest of the format. The main reason for using VLIs is address widths. Some architectures can express their full address range within a single byte. Others require up to 8 bytes for a full 64-bit address range. Future architectures are by no means limited to 64-bit address widths. For this very reason, it is necessary to scale down as well as up. A VLI can express an address in as little as a single byte, or scale upward to arbitrarily large numbers. This makes VLIs well suited to the addressing requirements of any architecture.

VLIs present their own issues, however. For example, expressing a negative number as a VLI is awkward at best. Some might be inclined to reserve one bit within a VLI to indicate signedness, but that is one more bit that cannot be used for data, which increases VLI overhead. The overhead is the additional bytes required to represent a full number in a VLI system. For example, it is common for numbers 0 - 127 to be contained entirely within a single byte, including the overhead. But numbers between 128 - 255 require an additional byte to hold more VLI "header" information (used to extend VLIs into arbitrarily long numbers). This is counter-intuitive, since a plain byte can hold any number between 0 - 255. Adding a sign bit reduces the range of VLIs by half again: a single byte can only encode numbers between 0 - 63.

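The solution is to use a different data type specifically for expressing negative numbers. The magnitude is encoded as a VLI just like a positive number, but when interpreting the value, it must be converted to a negative number by either subtracting it from zero (0 - ''n'') or multiplying it by negative one (''n'' * -1). In other words, the sign is carried by the data type and the VLI holds only the magnitude. Strictly speaking, this is a sign-and-magnitude scheme rather than a "one's complement", which would mean inverting the bits.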
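
A minimal sketch of this idea follows. It assumes a generic 7-bits-per-byte VLI with a continuation bit (low-order group first), which is '''not''' necessarily the encoding this proposal will adopt, and the type tags <tt>"pos"</tt> and <tt>"neg"</tt> are placeholders invented for the example.

```python
def vli_encode(n):
    """Encode a non-negative integer, 7 payload bits per byte, low bits first."""
    if n < 0:
        raise ValueError("a VLI holds a magnitude only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def vli_decode(data):
    """Decode one VLI produced by vli_encode()."""
    value = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            return value
    raise ValueError("truncated VLI")


def encode_integer(n):
    """Return (type tag, VLI bytes); the tags are hypothetical."""
    tag = "neg" if n < 0 else "pos"
    return tag, vli_encode(abs(n))


def decode_integer(tag, data):
    """Rebuild the signed value: the tag carries the sign."""
    magnitude = vli_decode(data)
    return 0 - magnitude if tag == "neg" else magnitude


print(encode_integer(-300))
```

Because the sign lives in the type tag, no payload bit is spent on it: 127 still fits in one byte whether it is positive or negative.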

In general, the size of a VLI is very predictable. A number using 0 - 7 bits of data (for example, the number "0" uses 0 bits of data, and the number "64" [binary: 1000000] uses 7 bits) can be encoded into a single byte, a number using 8 - 14 bits can be encoded into 2 bytes, a number using 15 - 21 bits can be encoded into 3 bytes, etc. See http://www.dlugosz.com/ZIP2/VLI.html for more information on the kind of VLI I am considering for this proposal.
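
The size rule stated above can be written down directly. The sketch below models only the simple "7 data bits per byte" rule from this paragraph; the linked VLI format may step differently for larger numbers, so treat <tt>vli_size</tt> (a name made up for this example) as an approximation for small values.

```python
def vli_size(n):
    """Bytes needed for a non-negative integer at 7 data bits per byte."""
    if n < 0:
        raise ValueError("only non-negative values are sized here")
    bits = n.bit_length()          # 0 for n == 0
    return max(1, -(-bits // 7))   # ceiling division, at least one byte


for n in (0, 64, 127, 128, 0x3FFF, 0x4000, 0x1FFFFF):
    print(n, n.bit_length(), vli_size(n))
```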

[[Category:Developer_Documentation]]

As85

2012-12-30T05:19:12Z

Parasyte: /* Development */ Update source link to git

as85 is a simple assembler for the Sharp sm8521, the microcontroller used in the Tiger Game.com. Game.com was released in 1998 and had only a few games ever made for it; it has not had any homebrew games made for it either. as85 is an attempt to build an assembler that will help hackers write homebrew code that will run on Game.com hardware.

Documentation on the Game.com hardware is available at [http://gamecom.guruwork.de/ Game.commies].

==Download==

The source code is available at http://git.kodewerx.org/as85/src/

==Current Progress==

The current state of as85 is "almost usable, but not quite there yet." A number of bugs exist which need to be fixed before it can be used as a development tool:

* [http://bugzilla.kodewerx.org/show_bug.cgi?id=2 Bug 2]: Add support for jump/call/branch instructions
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=3 Bug 3]: Output object code
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=4 Bug 4]: Write a linker

I've also filed a bug about giving the project a better name [http://bugzilla.kodewerx.org/show_bug.cgi?id=5].

==Usage==

The program takes one argument: the file name of an sm8521 assembly file. The files in the [http://git.kodewerx.org/as85/src/09f8ba9f769d47fc2038e81d9cc5866d80f96748/examples?at=master /examples] directory are a good place to start.

* test.asm is an example of what the beginning of a Game.com program might look like; it contains a header, and some [random] instructions to give you an idea.
* test2.asm is for testing the integrity of the parser with complex strings.
* test3.asm lists all possible sm8521 instructions, for verifying that the output binary is correct.

==Example Output==

The following command:
$ ./as85 ../examples/test3.asm

Produces the following output:
00 01 clr R1
01 01 neg R1
02 01 com R1
03 01 rr R1
04 01 rl R1
05 01 rrc R1
06 01 rlc R1
07 01 srl R1
08 01 inc R1
09 01 dec R1
0A 01 sra R1
0B 01 sll R1
0C 01 da R1
0D 01 swap R1
0E 01 push R1
0F 01 pop R1
10 0A cmp r1, r2
11 0A add r1, r2
12 0A sub r1, r2
13 0A adc r1, r2
14 0A sbc r1, r2
15 0A and r1, r2
16 0A or r1, r2
17 0A xor r1, r2
18 02 incw RR2
19 02 decw RR2
1A 08 clr @r1
1A 09 neg @r1
1A 0A com @r1
1A 0B rr @r1
1A 0C rl @r1
1A 0D rrc @r1
1A 0E rlc @r1
1A 0F srl @r1
1B 08 inc @r1
1B 09 dec @r1
1B 0A sra @r1
1B 0B sll @r1
1B 0C da @r1
1B 0D swap @r1
1B 0E push @r1
1B 0F pop @r1
1C 07 24 bclr 0xFF24, #7
1C 0F 94 bclr 0x94(r1), #7
1D 07 24 bset 0xFF24, #7
1D 0F 94 bset 0x94(r1), #7
1E 02 pushw RR2
1F 02 popw RR2
20 0A cmp r1, @r2
20 4A cmp r1, (r2)+
20 88 94 cmp r1, @0x94
20 8A 94 cmp r1, 0x94(r2)
20 CA cmp r1, -(r2)
21 0A add r1, @r2
21 4A add r1, (r2)+
21 88 94 add r1, @0x94
21 8A 94 add r1, 0x94(r2)
21 CA add r1, -(r2)
22 0A sub r1, @r2
22 4A sub r1, (r2)+
22 88 94 sub r1, @0x94
22 8A 94 sub r1, 0x94(r2)
22 CA sub r1, -(r2)
23 0A adc r1, @r2
23 4A adc r1, (r2)+
23 88 94 adc r1, @0x94
23 8A 94 adc r1, 0x94(r2)
23 CA adc r1, -(r2)
24 0A sbc r1, @r2
24 4A sbc r1, (r2)+
24 88 94 sbc r1, @0x94
24 8A 94 sbc r1, 0x94(r2)
24 CA sbc r1, -(r2)
25 0A and r1, @r2
25 4A and r1, (r2)+
25 88 94 and r1, @0x94
25 8A 94 and r1, 0x94(r2)
25 CA and r1, -(r2)
26 0A or r1, @r2
26 4A or r1, (r2)+
26 88 94 or r1, @0x94
26 8A 94 or r1, 0x94(r2)
26 CA or r1, -(r2)
27 0A xor r1, @r2
27 4A xor r1, (r2)+
27 88 94 xor r1, @0x94
27 8A 94 xor r1, 0x94(r2)
27 CA xor r1, -(r2)
28 0A mov r1, @r2
28 4A mov r1, (r2)+
28 88 94 mov r1, @0x94
28 8A 94 mov r1, 0x94(r2)
28 CA mov r1, -(r2)
29 11 mov @r1, r2
29 51 mov (r1)+, r2
29 90 94 mov @0x94, r2
29 91 94 mov 0x94(r1), r2
29 D1 mov -(r1), r2
2C 02 exts RR2
2E 94 mov ps0, #0x94
2F 94 01 btst R1, #0x94
30 09 cmp r1, @rr2
30 49 cmp r1, (rr2)+
30 88 24 94 cmp r1, @0x9424
30 89 24 94 cmp r1, 0x9424(rr2)
30 C9 cmp r1, -(rr2)
31 09 add r1, @rr2
31 49 add r1, (rr2)+
31 88 24 94 add r1, @0x9424
31 89 24 94 add r1, 0x9424(rr2)
31 C9 add r1, -(rr2)
32 09 sub r1, @rr2
32 49 sub r1, (rr2)+
32 88 24 94 sub r1, @0x9424
32 89 24 94 sub r1, 0x9424(rr2)
32 C9 sub r1, -(rr2)
33 09 adc r1, @rr2
33 49 adc r1, (rr2)+
33 88 24 94 adc r1, @0x9424
33 89 24 94 adc r1, 0x9424(rr2)
33 C9 adc r1, -(rr2)
34 09 sbc r1, @rr2
34 49 sbc r1, (rr2)+
34 88 24 94 sbc r1, @0x9424
34 89 24 94 sbc r1, 0x9424(rr2)
34 C9 sbc r1, -(rr2)
35 09 and r1, @rr2
35 49 and r1, (rr2)+
35 88 24 94 and r1, @0x9424
35 89 24 94 and r1, 0x9424(rr2)
35 C9 and r1, -(rr2)
36 09 or r1, @rr2
36 49 or r1, (rr2)+
36 88 24 94 or r1, @0x9424
36 89 24 94 or r1, 0x9424(rr2)
36 C9 or r1, -(rr2)
37 09 xor r1, @rr2
37 49 xor r1, (rr2)+
37 88 24 94 xor r1, @0x9424
37 89 24 94 xor r1, 0x9424(rr2)
37 C9 xor r1, -(rr2)
38 09 mov r1, @rr2
38 49 mov r1, (rr2)+
38 88 24 94 mov r1, @0x9424
38 89 24 94 mov r1, 0x9424(rr2)
38 C9 mov r1, -(rr2)
39 21 mov @rr2, r4
39 61 mov (rr2)+, r4
39 A0 24 94 mov @0x9424, r4
39 A1 24 94 mov 0x9424(rr2), r4
39 E1 mov -(rr2), r4
3A 14 movw rr2, @rr4
3A 54 movw rr2, (rr4)+
3A 90 24 94 movw rr2, @0x9424
3A 94 24 94 movw rr2, 0x9424(rr4)
3A D4 movw rr2, -(rr4)
3B 22 movw @rr2, rr4
3B 62 movw (rr2)+, rr4
3B A0 24 94 movw @0x9424, rr4
3B A2 24 94 movw 0x9424(rr2), rr4
3B E2 movw -(rr2), rr4
3C 14 movw rr2, rr4
40 02 01 cmp R1, R2
41 02 01 add R1, R2
42 02 01 sub R1, R2
43 02 01 adc R1, R2
44 02 01 sbc R1, R2
45 02 01 and R1, R2
46 02 01 or R1, R2
47 02 01 xor R1, R2
48 02 01 mov R1, R2
4A 04 02 movw RR2, RR4
4B 02 24 94 movw RR2, #0x9424
4C 04 02 mult RR2, R4
4D 94 02 mult RR2, #0x94
4E 07 01 bmov bf, R1, #7
4E 47 01 bmov R1, #7, bf
4F 07 01 bcmp bf, R1, #7
4F 47 01 band bf, R1, #7
4F 87 01 bor bf, R1, #7
4F C7 01 bxor bf, R1, #7
50 94 01 cmp R1, #0x94
51 94 01 add R1, #0x94
52 94 01 sub R1, #0x94
53 94 01 adc R1, #0x94
54 94 01 sbc R1, #0x94
55 94 01 and R1, #0x94
56 94 01 or R1, #0x94
57 94 01 xor R1, #0x94
58 94 01 mov R1, #0x94
5C 04 02 div RR2, RR4
5D 94 02 div RR2, #0x94
5E 01 94 02 movm R1, #0x94, R2
5F 01 94 24 movm R1, #0x94, #0x24
60 04 02 cmpw RR2, RR4
61 04 02 addw RR2, RR4
62 04 02 subw RR2, RR4
63 04 02 adcw RR2, RR4
64 04 02 sbcw RR2, RR4
65 04 02 andw RR2, RR4
66 04 02 orw RR2, RR4
67 04 02 xorw RR2, RR4
68 02 24 94 cmpw RR2, #0x9424
69 02 24 94 addw RR2, #0x9424
6A 02 24 94 subw RR2, #0x9424
6B 02 24 94 adcw RR2, #0x9424
6C 02 24 94 sbcw RR2, #0x9424
6D 02 24 94 andw RR2, #0x9424
6E 02 24 94 orw RR2, #0x9424
6F 02 24 94 xorw RR2, #0x9424
78 24 94 movw rr0, #0x9424
79 24 94 movw rr8, #0x9424
7A 24 94 movw rr2, #0x9424
7B 24 94 movw rr10, #0x9424
7C 24 94 movw rr4, #0x9424
7D 24 94 movw rr12, #0x9424
7E 24 94 movw rr6, #0x9424
7F 24 94 movw rr14, #0x9424
A0 01 bclr R1, #0
A1 01 bclr R1, #1
A2 01 bclr R1, #2
A3 01 bclr R1, #3
A4 01 bclr R1, #4
A5 01 bclr R1, #5
A6 01 bclr R1, #6
A7 01 bclr R1, #7
A8 01 bset R1, #0
A9 01 bset R1, #1
AA 01 bset R1, #2
AB 01 bset R1, #3
AC 01 bset R1, #4
AD 01 bset R1, #5
AE 01 bset R1, #6
AF 01 bset R1, #7
B0 01 mov r0, R1
B1 01 mov r1, R1
B2 01 mov r2, R1
B3 01 mov r3, R1
B4 01 mov r4, R1
B5 01 mov r5, R1
B6 01 mov r6, R1
B7 01 mov r7, R1
B8 01 mov R1, r0
B9 01 mov R1, r1
BA 01 mov R1, r2
BB 01 mov R1, r3
BC 01 mov R1, r4
BD 01 mov R1, r5
BE 01 mov R1, r6
BF 01 mov R1, r7
C0 94 mov r0, #0x94
C1 94 mov r1, #0x94
C2 94 mov r2, #0x94
C3 94 mov r3, #0x94
C4 94 mov r4, #0x94
C5 94 mov r5, #0x94
C6 94 mov r6, #0x94
C7 94 mov r7, #0x94
C8 94 mov ie0, #0x94
C9 94 mov ie1, #0x94
CA 94 mov ir0, #0x94
CB 94 mov ir1, #0x94
CC 94 mov p0, #0x94
CD 94 mov p1, #0x94
CE 94 mov p2, #0x94
CF 94 mov p3, #0x94
F0 stop
F1 halt
F8 ret
F9 iret
FA clrc
FB comc
FC setc
FD ei
FE di
FF nop
assemble() returned 0: OK
Clean up...

Note: The output binary has not been verified for accuracy.

==Development==

As85 is a fairly simple assembler. It doesn't use any sort of 'compiler-compiler' for lexical analysis. In fact, its lexical analysis is very specific to the sm8521 MCU.

The main loop (the assemble() function, defined in [http://git.kodewerx.org/as85/src/09f8ba9f769d/src/asm.c?at=master#cl-315 asm.c]) does the input text parsing inline. (This should probably be moved out to a new source file.) Each line is split into two pieces, op[0] containing the instruction and op[1] containing its operands, and op[0] is then compared as a string against every supported instruction. If a match is found, the operands string is passed to a dynamically chosen function (from a [http://git.kodewerx.org/as85/src/09f8ba9f769d/src/inst.c?at=master#cl-156 function pointer table], indexed by the matched instruction). This function performs the lexical analysis required to decide which instruction encoding we are trying to assemble.
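
The shape of that loop can be sketched as follows. This is an illustration in Python rather than the C used by as85; the handler names and the <tt>HANDLERS</tt> table are invented for the example, and the two encodings (<tt>clr R1</tt> as 00 01, <tt>nop</tt> as FF) are taken from the test3.asm output above, which has not been verified.

```python
def asm_clr(operands):
    """Handle only the 'clr Rn' form seen in the sample output (00 nn)."""
    if not (operands.startswith("R") and operands[1:].isdigit()):
        raise ValueError("unsupported operands for clr: " + operands)
    return bytes([0x00, int(operands[1:])])


def asm_nop(operands):
    """'nop' takes no operands and assembles to a single FF byte."""
    if operands:
        raise ValueError("nop takes no operands")
    return bytes([0xFF])


# Stand-in for the function pointer table: one handler per mnemonic.
HANDLERS = [("clr", asm_clr), ("nop", asm_nop)]


def assemble_line(line):
    """Split a line into op[0]/op[1] and dispatch on the mnemonic."""
    op = line.strip().split(None, 1)
    if not op:
        return b""                      # blank line: nothing to assemble
    mnemonic = op[0].lower()
    operands = op[1].strip() if len(op) > 1 else ""
    for name, handler in HANDLERS:      # linear string comparison
        if name == mnemonic:
            return handler(operands)
    raise ValueError("unknown instruction: " + mnemonic)


print(assemble_line("clr R1").hex(" "))
```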

Since the sm8521 is a CISC machine, its instruction set contains a number of different ways to assemble the same instruction mnemonic. For example, several different addressing modes for the ''mov'' instruction are shown above in the test3.asm output. The lexical analysis is the voodoo which picks the proper addressing mode and byte codes by analyzing the operands.

The lexical analyzing functions are defined in inst.c (following the function pointer table mentioned previously). The function handling the current instruction will test the operands string against a series of lexical patterns with the [http://git.kodewerx.org/as85/src/09f8ba9f769d/src/asm.c?at=master#cl-89 chk_pattern()] function (defined in asm.c, although this should probably be moved).

chk_pattern() uses a scanf-like formatting string, rather than the regular expressions more commonly used in lexical analysis. Documentation for the formatting string can be found in [http://git.kodewerx.org/as85/src/09f8ba9f769d/src/inst.h?at=master#cl-29 inst.h]. With just a few pattern primitives, any of the sm8521's addressing modes can be matched, with the matched primitives output as part of an array. It makes good use of the format scanners defined in [http://git.kodewerx.org/as85/src/09f8ba9f769d/src/scan.c?at=master scan.c].
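
To give a feel for the approach, here is a rough Python sketch of a scanf-like matcher choosing an addressing mode for ''cmp''. The primitives <tt>%r</tt> (a register r0-r7) and <tt>%n</tt> (a number) are '''hypothetical''' and are not the format documented in inst.h. The byte layout (mode in the top two bits, then the two registers) is inferred from the unverified test3.asm output above.

```python
import string


def chk_pattern(text, pattern):
    """Return captured integers if text matches pattern, else None."""
    values = []
    t = 0
    p = 0
    while p < len(pattern):
        ch = pattern[p]
        if ch == " ":
            # A space in the pattern matches any run of whitespace.
            while t < len(text) and text[t].isspace():
                t += 1
            p += 1
        elif ch == "%":
            kind = pattern[p + 1]
            p += 2
            if kind == "r":
                # 'r' followed by one digit 0-7
                if t + 1 < len(text) and text[t] == "r" and text[t + 1] in "01234567":
                    values.append(int(text[t + 1]))
                    t += 2
                else:
                    return None
            elif kind == "n":
                base, digits = 10, string.digits
                if text.startswith("0x", t):
                    base, digits = 16, string.hexdigits
                    t += 2
                start = t
                while t < len(text) and text[t] in digits:
                    t += 1
                if t == start:
                    return None
                values.append(int(text[start:t], base))
            else:
                raise ValueError("unknown primitive %" + kind)
        elif t < len(text) and text[t] == ch:
            t += 1
            p += 1
        else:
            return None
    return values if t == len(text) else None


# (pattern, builder) pairs; each builder returns (mode, dst, src, disp).
CMP_FORMS = [
    ("%r, @%r", lambda d, s: (0x00, d, s, None)),
    ("%r, (%r)+", lambda d, s: (0x40, d, s, None)),
    ("%r, @%n", lambda d, n: (0x80, d, 0, n)),
    ("%r, %n(%r)", lambda d, n, s: (0x80, d, s, n)),
    ("%r, -(%r)", lambda d, s: (0xC0, d, s, None)),
]


def encode_cmp(operands):
    """Try each form in order and build the byte codes for the first match."""
    for pattern, build in CMP_FORMS:
        captured = chk_pattern(operands, pattern)
        if captured is None:
            continue
        mode, dst, src, disp = build(*captured)
        code = [0x20, mode | (dst << 3) | src]
        if disp is not None:
            if disp > 0xFF:
                raise ValueError("displacement does not fit in one byte")
            code.append(disp)
        return bytes(code)
    raise ValueError("no addressing mode matches: " + operands)


print(encode_cmp("r1, 0x94(r2)").hex(" "))
```

The point of the sketch is the control flow: a list of patterns is tried in turn, and the captured primitives are handed to whatever builds the final bytes.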

If chk_pattern() manages to find a match, the matching data may be further checked for validity on a per-context basis. Finally, the full instruction byte codes will be put together and returned to the assemble() loop. This is where the object code would be built. Currently, the only thing that happens is that the assembled instruction is dumped to stdout, in a debug build. [http://git.kodewerx.org/as85/src/09f8ba9f769d/src/asm.c?at=master#cl-464]

===Optimization Concerns===

Some optimization could be done within this lexical analysis process. The first improvement would be replacing the linear string comparison with a binary search tree. The second thing that would help in this immediate area would be replacing the string comparison itself with a hash comparison. The hash algorithm would have to be suitably small and fast enough to make much of a difference.
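
The three lookup strategies can be compared side by side. This sketch uses a short, arbitrary subset of mnemonics, a binary search over a sorted list (via Python's <tt>bisect</tt>) in place of a binary search tree, and Python's built-in dict as the hash table; none of this reflects how as85 itself is written.

```python
import bisect

# A small sorted subset of mnemonics, for illustration only.
MNEMONICS = sorted(["adc", "add", "and", "clr", "cmp", "mov", "nop", "or", "sub", "xor"])


def find_linear(name):
    """Current approach: compare against every entry in turn."""
    for index, mnemonic in enumerate(MNEMONICS):
        if mnemonic == name:
            return index
    return -1


def find_binary(name):
    """Binary search over the sorted list: O(log n) string comparisons."""
    index = bisect.bisect_left(MNEMONICS, name)
    if index < len(MNEMONICS) and MNEMONICS[index] == name:
        return index
    return -1


# Hash lookup: one hash computation, then (usually) one comparison.
INDEX = {mnemonic: index for index, mnemonic in enumerate(MNEMONICS)}


def find_hashed(name):
    return INDEX.get(name, -1)


print(find_linear("mov"), find_binary("mov"), find_hashed("mov"))
```

All three return the same table index, so the function pointer table itself would not need to change.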

The next big optimization would be rearranging the chk_pattern() calls within each lexical analyzer to check the most likely patterns first. The best way to choose the order is static analysis of sm8521 source code, which is obviously in very short supply. Accurate disassemblies of commercial Game.com games would be helpful to this end, however.
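
Once such statistics exist, reordering is mechanical. The sketch below assumes the analysis yields a flat list of which pattern matched each instruction; the pattern strings and the sample observations are placeholders, not real measurements.

```python
from collections import Counter


def order_by_frequency(patterns, observed):
    """Sort patterns so the most frequently observed ones are tried first.

    sorted() is stable, so patterns with equal counts keep their
    original relative order.
    """
    counts = Counter(observed)
    return sorted(patterns, key=lambda pattern: -counts[pattern])


# Placeholder data: which pattern matched in some hypothetical corpus.
patterns = ["%r, @%r", "%r, (%r)+", "%r, %n(%r)"]
observed = ["%r, %n(%r)", "%r, %n(%r)", "%r, (%r)+"]
print(order_by_frequency(patterns, observed))
```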

[[Category:Developer_Documentation]]

As85

2012-12-30T05:14:59Z

Parasyte: /* Usage */ Update source link to git

as85 is a simple assembler for the Sharp sm8521, the microcontroller used in the Tiger Game.com. Game.com was released in 1998 and had only a few games ever made for it; it has not had any homebrew games made for it either. as85 is an attempt to build an assembler that will help hackers write homebrew code that will run on Game.com hardware.

Documentation on the Game.com hardware is available at [http://gamecom.guruwork.de/ Game.commies].

==Download==

The source code is available at http://git.kodewerx.org/as85/src/

==Current Progress==

The current state of as85 is "almost usable, but not quite there yet." A number of bugs exist which need to be fixed before it can be used as a development tool:

* [http://bugzilla.kodewerx.org/show_bug.cgi?id=2 Bug 2]: Add support for jump/call/branch instructions
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=3 Bug 3]: Output object code
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=4 Bug 4]: Write a linker

I've also filed a bug about giving the project a better name [http://bugzilla.kodewerx.org/show_bug.cgi?id=5].

==Usage==

The program takes one argument: the file name of an sm8521 assembly file. The files in the [http://git.kodewerx.org/as85/src/09f8ba9f769d47fc2038e81d9cc5866d80f96748/examples?at=master /examples] directory are a good place to start.

* test.asm is an example of what the beginning of a Game.com program might look like; it contains a header, and some [random] instructions to give you an idea.
* test2.asm is for testing the integrity of the parser with complex strings.
* test3.asm lists all possible sm8521 instructions, for verifying that the output binary is correct.

==Example Output==

The following command:
$ ./as85 ../examples/test3.asm

Produces the following output:
00 01 clr R1
01 01 neg R1
02 01 com R1
03 01 rr R1
04 01 rl R1
05 01 rrc R1
06 01 rlc R1
07 01 srl R1
08 01 inc R1
09 01 dec R1
0A 01 sra R1
0B 01 sll R1
0C 01 da R1
0D 01 swap R1
0E 01 push R1
0F 01 pop R1
10 0A cmp r1, r2
11 0A add r1, r2
12 0A sub r1, r2
13 0A adc r1, r2
14 0A sbc r1, r2
15 0A and r1, r2
16 0A or r1, r2
17 0A xor r1, r2
18 02 incw RR2
19 02 decw RR2
1A 08 clr @r1
1A 09 neg @r1
1A 0A com @r1
1A 0B rr @r1
1A 0C rl @r1
1A 0D rrc @r1
1A 0E rlc @r1
1A 0F srl @r1
1B 08 inc @r1
1B 09 dec @r1
1B 0A sra @r1
1B 0B sll @r1
1B 0C da @r1
1B 0D swap @r1
1B 0E push @r1
1B 0F pop @r1
1C 07 24 bclr 0xFF24, #7
1C 0F 94 bclr 0x94(r1), #7
1D 07 24 bset 0xFF24, #7
1D 0F 94 bset 0x94(r1), #7
1E 02 pushw RR2
1F 02 popw RR2
20 0A cmp r1, @r2
20 4A cmp r1, (r2)+
20 88 94 cmp r1, @0x94
20 8A 94 cmp r1, 0x94(r2)
20 CA cmp r1, -(r2)
21 0A add r1, @r2
21 4A add r1, (r2)+
21 88 94 add r1, @0x94
21 8A 94 add r1, 0x94(r2)
21 CA add r1, -(r2)
22 0A sub r1, @r2
22 4A sub r1, (r2)+
22 88 94 sub r1, @0x94
22 8A 94 sub r1, 0x94(r2)
22 CA sub r1, -(r2)
23 0A adc r1, @r2
23 4A adc r1, (r2)+
23 88 94 adc r1, @0x94
23 8A 94 adc r1, 0x94(r2)
23 CA adc r1, -(r2)
24 0A sbc r1, @r2
24 4A sbc r1, (r2)+
24 88 94 sbc r1, @0x94
24 8A 94 sbc r1, 0x94(r2)
24 CA sbc r1, -(r2)
25 0A and r1, @r2
25 4A and r1, (r2)+
25 88 94 and r1, @0x94
25 8A 94 and r1, 0x94(r2)
25 CA and r1, -(r2)
26 0A or r1, @r2
26 4A or r1, (r2)+
26 88 94 or r1, @0x94
26 8A 94 or r1, 0x94(r2)
26 CA or r1, -(r2)
27 0A xor r1, @r2
27 4A xor r1, (r2)+
27 88 94 xor r1, @0x94
27 8A 94 xor r1, 0x94(r2)
27 CA xor r1, -(r2)
28 0A mov r1, @r2
28 4A mov r1, (r2)+
28 88 94 mov r1, @0x94
28 8A 94 mov r1, 0x94(r2)
28 CA mov r1, -(r2)
29 11 mov @r1, r2
29 51 mov (r1)+, r2
29 90 94 mov @0x94, r2
29 91 94 mov 0x94(r1), r2
29 D1 mov -(r1), r2
2C 02 exts RR2
2E 94 mov ps0, #0x94
2F 94 01 btst R1, #0x94
30 09 cmp r1, @rr2
30 49 cmp r1, (rr2)+
30 88 24 94 cmp r1, @0x9424
30 89 24 94 cmp r1, 0x9424(rr2)
30 C9 cmp r1, -(rr2)
31 09 add r1, @rr2
31 49 add r1, (rr2)+
31 88 24 94 add r1, @0x9424
31 89 24 94 add r1, 0x9424(rr2)
31 C9 add r1, -(rr2)
32 09 sub r1, @rr2
32 49 sub r1, (rr2)+
32 88 24 94 sub r1, @0x9424
32 89 24 94 sub r1, 0x9424(rr2)
32 C9 sub r1, -(rr2)
33 09 adc r1, @rr2
33 49 adc r1, (rr2)+
33 88 24 94 adc r1, @0x9424
33 89 24 94 adc r1, 0x9424(rr2)
33 C9 adc r1, -(rr2)
34 09 sbc r1, @rr2
34 49 sbc r1, (rr2)+
34 88 24 94 sbc r1, @0x9424
34 89 24 94 sbc r1, 0x9424(rr2)
34 C9 sbc r1, -(rr2)
35 09 and r1, @rr2
35 49 and r1, (rr2)+
35 88 24 94 and r1, @0x9424
35 89 24 94 and r1, 0x9424(rr2)
35 C9 and r1, -(rr2)
36 09 or r1, @rr2
36 49 or r1, (rr2)+
36 88 24 94 or r1, @0x9424
36 89 24 94 or r1, 0x9424(rr2)
36 C9 or r1, -(rr2)
37 09 xor r1, @rr2
37 49 xor r1, (rr2)+
37 88 24 94 xor r1, @0x9424
37 89 24 94 xor r1, 0x9424(rr2)
37 C9 xor r1, -(rr2)
38 09 mov r1, @rr2
38 49 mov r1, (rr2)+
38 88 24 94 mov r1, @0x9424
38 89 24 94 mov r1, 0x9424(rr2)
38 C9 mov r1, -(rr2)
39 21 mov @rr2, r4
39 61 mov (rr2)+, r4
39 A0 24 94 mov @0x9424, r4
39 A1 24 94 mov 0x9424(rr2), r4
39 E1 mov -(rr2), r4
3A 14 movw rr2, @rr4
3A 54 movw rr2, (rr4)+
3A 90 24 94 movw rr2, @0x9424
3A 94 24 94 movw rr2, 0x9424(rr4)
3A D4 movw rr2, -(rr4)
3B 22 movw @rr2, rr4
3B 62 movw (rr2)+, rr4
3B A0 24 94 movw @0x9424, rr4
3B A2 24 94 movw 0x9424(rr2), rr4
3B E2 movw -(rr2), rr4
3C 14 movw rr2, rr4
40 02 01 cmp R1, R2
41 02 01 add R1, R2
42 02 01 sub R1, R2
43 02 01 adc R1, R2
44 02 01 sbc R1, R2
45 02 01 and R1, R2
46 02 01 or R1, R2
47 02 01 xor R1, R2
48 02 01 mov R1, R2
4A 04 02 movw RR2, RR4
4B 02 24 94 movw RR2, #0x9424
4C 04 02 mult RR2, R4
4D 94 02 mult RR2, #0x94
4E 07 01 bmov bf, R1, #7
4E 47 01 bmov R1, #7, bf
4F 07 01 bcmp bf, R1, #7
4F 47 01 band bf, R1, #7
4F 87 01 bor bf, R1, #7
4F C7 01 bxor bf, R1, #7
50 94 01 cmp R1, #0x94
51 94 01 add R1, #0x94
52 94 01 sub R1, #0x94
53 94 01 adc R1, #0x94
54 94 01 sbc R1, #0x94
55 94 01 and R1, #0x94
56 94 01 or R1, #0x94
57 94 01 xor R1, #0x94
58 94 01 mov R1, #0x94
5C 04 02 div RR2, RR4
5D 94 02 div RR2, #0x94
5E 01 94 02 movm R1, #0x94, R2
5F 01 94 24 movm R1, #0x94, #0x24
60 04 02 cmpw RR2, RR4
61 04 02 addw RR2, RR4
62 04 02 subw RR2, RR4
63 04 02 adcw RR2, RR4
64 04 02 sbcw RR2, RR4
65 04 02 andw RR2, RR4
66 04 02 orw RR2, RR4
67 04 02 xorw RR2, RR4
68 02 24 94 cmpw RR2, #0x9424
69 02 24 94 addw RR2, #0x9424
6A 02 24 94 subw RR2, #0x9424
6B 02 24 94 adcw RR2, #0x9424
6C 02 24 94 sbcw RR2, #0x9424
6D 02 24 94 andw RR2, #0x9424
6E 02 24 94 orw RR2, #0x9424
6F 02 24 94 xorw RR2, #0x9424
78 24 94 movw rr0, #0x9424
79 24 94 movw rr8, #0x9424
7A 24 94 movw rr2, #0x9424
7B 24 94 movw rr10, #0x9424
7C 24 94 movw rr4, #0x9424
7D 24 94 movw rr12, #0x9424
7E 24 94 movw rr6, #0x9424
7F 24 94 movw rr14, #0x9424
A0 01 bclr R1, #0
A1 01 bclr R1, #1
A2 01 bclr R1, #2
A3 01 bclr R1, #3
A4 01 bclr R1, #4
A5 01 bclr R1, #5
A6 01 bclr R1, #6
A7 01 bclr R1, #7
A8 01 bset R1, #0
A9 01 bset R1, #1
AA 01 bset R1, #2
AB 01 bset R1, #3
AC 01 bset R1, #4
AD 01 bset R1, #5
AE 01 bset R1, #6
AF 01 bset R1, #7
B0 01 mov r0, R1
B1 01 mov r1, R1
B2 01 mov r2, R1
B3 01 mov r3, R1
B4 01 mov r4, R1
B5 01 mov r5, R1
B6 01 mov r6, R1
B7 01 mov r7, R1
B8 01 mov R1, r0
B9 01 mov R1, r1
BA 01 mov R1, r2
BB 01 mov R1, r3
BC 01 mov R1, r4
BD 01 mov R1, r5
BE 01 mov R1, r6
BF 01 mov R1, r7
C0 94 mov r0, #0x94
C1 94 mov r1, #0x94
C2 94 mov r2, #0x94
C3 94 mov r3, #0x94
C4 94 mov r4, #0x94
C5 94 mov r5, #0x94
C6 94 mov r6, #0x94
C7 94 mov r7, #0x94
C8 94 mov ie0, #0x94
C9 94 mov ie1, #0x94
CA 94 mov ir0, #0x94
CB 94 mov ir1, #0x94
CC 94 mov p0, #0x94
CD 94 mov p1, #0x94
CE 94 mov p2, #0x94
CF 94 mov p3, #0x94
F0 stop
F1 halt
F8 ret
F9 iret
FA clrc
FB comc
FC setc
FD ei
FE di
FF nop
assemble() returned 0: OK
Clean up...

Note: The output binary has not been verified for accuracy.

==Development==

As85 is a fairly simple assembler. It doesn't use any sort of 'compiler-compiler' for lexical analysis. In fact, its lexical analysis is very specific to the sm8521 MCU.

The main loop (the assemble() function, defined in [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/asm.c?at=default#cl-315 asm.c]) does the input text parsing inline. (This should probably be moved out to a new source file.) Each line is split into two pieces, op[0] containing the instruction and op[1] containing its operands, and op[0] is then compared as a string against every supported instruction. If a match is found, the operands string is passed to a dynamically chosen function (from a [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/inst.c?at=default#cl-156 function pointer table], indexed by the matched instruction). This function performs the lexical analysis required to decide which instruction encoding we are trying to assemble.

Since the sm8521 is a CISC machine, its instruction set contains a number of different ways to assemble the same instruction mnemonic. For example, several different addressing modes for the ''mov'' instruction are shown above in the test3.asm output. The lexical analysis is the voodoo which picks the proper addressing mode and byte codes by analyzing the operands.

The lexical analyzing functions are defined in inst.c (following the function pointer table mentioned previously). The function handling the current instruction will test the operands string against a series of lexical patterns with the [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/asm.c?at=default#cl-89 chk_pattern()] function (defined in asm.c, although this should probably be moved).

chk_pattern() uses a scanf-like formatting string, rather than the regular expressions more commonly used in lexical analysis. Documentation for the formatting string can be found in [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/inst.h?at=default#cl-29 inst.h]. With just a few pattern primitives, any of the sm8521's addressing modes can be matched, with the matched primitives output as part of an array. It makes good use of the format scanners defined in [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/scan.c?at=default scan.c].

If chk_pattern() manages to find a match, the matching data may be further checked for validity on a per-context basis. Finally, the full instruction byte codes will be put together and returned to the assemble() loop. This is where the object code would be built. Currently, the only thing that happens is that the assembled instruction is dumped to stdout, in a debug build. [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/asm.c?at=default#cl-464]

===Optimization Concerns===

Some optimization could be done within this lexical analysis process. The first improvement would be replacing the linear string comparison with a binary search tree. The second thing that would help in this immediate area would be replacing the string comparison itself with a hash comparison. The hash algorithm would have to be suitably small and fast enough to make much of a difference.

The next big optimization would be rearranging the chk_pattern() calls within each lexical analyzer to check the most likely patterns first. The best way to choose the order is static analysis of sm8521 source code, which is obviously in very short supply. Accurate disassemblies of commercial Game.com games would be helpful to this end, however.

[[Category:Developer_Documentation]]

As85

2012-12-30T04:13:23Z

Parasyte: /* Development */ Update source links

as85 is a simple assembler for the Sharp sm8521, the microcontroller used in the Tiger Game.com. Game.com was released in 1998 and had only a few games ever made for it; it has not had any homebrew games made for it either. as85 is an attempt to build an assembler that will help hackers write homebrew code that will run on Game.com hardware.

Documentation on the Game.com hardware is available at [http://gamecom.guruwork.de/ Game.commies].

==Download==

The source code is available at http://git.kodewerx.org/as85/src/

==Current Progress==

The current state of as85 is "almost usable, but not quite there yet." A number of bugs exist which need to be fixed before it can be used as a development tool:

* [http://bugzilla.kodewerx.org/show_bug.cgi?id=2 Bug 2]: Add support for jump/call/branch instructions
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=3 Bug 3]: Output object code
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=4 Bug 4]: Write a linker

I've also filed a bug about giving the project a better name [http://bugzilla.kodewerx.org/show_bug.cgi?id=5].

==Usage==

The program takes one argument: the file name of an sm8521 assembly file. The files in the [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/examples?at=default /examples] directory are a good place to start.

* test.asm is an example of what the beginning of a Game.com program might look like; it contains a header, and some [random] instructions to give you an idea.
* test2.asm is for testing the integrity of the parser with complex strings.
* test3.asm lists all possible sm8521 instructions, for verifying that the output binary is correct.

==Example Output==

The following command:
$ ./as85 ../examples/test3.asm

Produces the following output:
00 01 clr R1
01 01 neg R1
02 01 com R1
03 01 rr R1
04 01 rl R1
05 01 rrc R1
06 01 rlc R1
07 01 srl R1
08 01 inc R1
09 01 dec R1
0A 01 sra R1
0B 01 sll R1
0C 01 da R1
0D 01 swap R1
0E 01 push R1
0F 01 pop R1
10 0A cmp r1, r2
11 0A add r1, r2
12 0A sub r1, r2
13 0A adc r1, r2
14 0A sbc r1, r2
15 0A and r1, r2
16 0A or r1, r2
17 0A xor r1, r2
18 02 incw RR2
19 02 decw RR2
1A 08 clr @r1
1A 09 neg @r1
1A 0A com @r1
1A 0B rr @r1
1A 0C rl @r1
1A 0D rrc @r1
1A 0E rlc @r1
1A 0F srl @r1
1B 08 inc @r1
1B 09 dec @r1
1B 0A sra @r1
1B 0B sll @r1
1B 0C da @r1
1B 0D swap @r1
1B 0E push @r1
1B 0F pop @r1
1C 07 24 bclr 0xFF24, #7
1C 0F 94 bclr 0x94(r1), #7
1D 07 24 bset 0xFF24, #7
1D 0F 94 bset 0x94(r1), #7
1E 02 pushw RR2
1F 02 popw RR2
20 0A cmp r1, @r2
20 4A cmp r1, (r2)+
20 88 94 cmp r1, @0x94
20 8A 94 cmp r1, 0x94(r2)
20 CA cmp r1, -(r2)
21 0A add r1, @r2
21 4A add r1, (r2)+
21 88 94 add r1, @0x94
21 8A 94 add r1, 0x94(r2)
21 CA add r1, -(r2)
22 0A sub r1, @r2
22 4A sub r1, (r2)+
22 88 94 sub r1, @0x94
22 8A 94 sub r1, 0x94(r2)
22 CA sub r1, -(r2)
23 0A adc r1, @r2
23 4A adc r1, (r2)+
23 88 94 adc r1, @0x94
23 8A 94 adc r1, 0x94(r2)
23 CA adc r1, -(r2)
24 0A sbc r1, @r2
24 4A sbc r1, (r2)+
24 88 94 sbc r1, @0x94
24 8A 94 sbc r1, 0x94(r2)
24 CA sbc r1, -(r2)
25 0A and r1, @r2
25 4A and r1, (r2)+
25 88 94 and r1, @0x94
25 8A 94 and r1, 0x94(r2)
25 CA and r1, -(r2)
26 0A or r1, @r2
26 4A or r1, (r2)+
26 88 94 or r1, @0x94
26 8A 94 or r1, 0x94(r2)
26 CA or r1, -(r2)
27 0A xor r1, @r2
27 4A xor r1, (r2)+
27 88 94 xor r1, @0x94
27 8A 94 xor r1, 0x94(r2)
27 CA xor r1, -(r2)
28 0A mov r1, @r2
28 4A mov r1, (r2)+
28 88 94 mov r1, @0x94
28 8A 94 mov r1, 0x94(r2)
28 CA mov r1, -(r2)
29 11 mov @r1, r2
29 51 mov (r1)+, r2
29 90 94 mov @0x94, r2
29 91 94 mov 0x94(r1), r2
29 D1 mov -(r1), r2
2C 02 exts RR2
2E 94 mov ps0, #0x94
2F 94 01 btst R1, #0x94
30 09 cmp r1, @rr2
30 49 cmp r1, (rr2)+
30 88 24 94 cmp r1, @0x9424
30 89 24 94 cmp r1, 0x9424(rr2)
30 C9 cmp r1, -(rr2)
31 09 add r1, @rr2
31 49 add r1, (rr2)+
31 88 24 94 add r1, @0x9424
31 89 24 94 add r1, 0x9424(rr2)
31 C9 add r1, -(rr2)
32 09 sub r1, @rr2
32 49 sub r1, (rr2)+
32 88 24 94 sub r1, @0x9424
32 89 24 94 sub r1, 0x9424(rr2)
32 C9 sub r1, -(rr2)
33 09 adc r1, @rr2
33 49 adc r1, (rr2)+
33 88 24 94 adc r1, @0x9424
33 89 24 94 adc r1, 0x9424(rr2)
33 C9 adc r1, -(rr2)
34 09 sbc r1, @rr2
34 49 sbc r1, (rr2)+
34 88 24 94 sbc r1, @0x9424
34 89 24 94 sbc r1, 0x9424(rr2)
34 C9 sbc r1, -(rr2)
35 09 and r1, @rr2
35 49 and r1, (rr2)+
35 88 24 94 and r1, @0x9424
35 89 24 94 and r1, 0x9424(rr2)
35 C9 and r1, -(rr2)
36 09 or r1, @rr2
36 49 or r1, (rr2)+
36 88 24 94 or r1, @0x9424
36 89 24 94 or r1, 0x9424(rr2)
36 C9 or r1, -(rr2)
37 09 xor r1, @rr2
37 49 xor r1, (rr2)+
37 88 24 94 xor r1, @0x9424
37 89 24 94 xor r1, 0x9424(rr2)
37 C9 xor r1, -(rr2)
38 09 mov r1, @rr2
38 49 mov r1, (rr2)+
38 88 24 94 mov r1, @0x9424
38 89 24 94 mov r1, 0x9424(rr2)
38 C9 mov r1, -(rr2)
39 21 mov @rr2, r4
39 61 mov (rr2)+, r4
39 A0 24 94 mov @0x9424, r4
39 A1 24 94 mov 0x9424(rr2), r4
39 E1 mov -(rr2), r4
3A 14 movw rr2, @rr4
3A 54 movw rr2, (rr4)+
3A 90 24 94 movw rr2, @0x9424
3A 94 24 94 movw rr2, 0x9424(rr4)
3A D4 movw rr2, -(rr4)
3B 22 movw @rr2, rr4
3B 62 movw (rr2)+, rr4
3B A0 24 94 movw @0x9424, rr4
3B A2 24 94 movw 0x9424(rr2), rr4
3B E2 movw -(rr2), rr4
3C 14 movw rr2, rr4
40 02 01 cmp R1, R2
41 02 01 add R1, R2
42 02 01 sub R1, R2
43 02 01 adc R1, R2
44 02 01 sbc R1, R2
45 02 01 and R1, R2
46 02 01 or R1, R2
47 02 01 xor R1, R2
48 02 01 mov R1, R2
4A 04 02 movw RR2, RR4
4B 02 24 94 movw RR2, #0x9424
4C 04 02 mult RR2, R4
4D 94 02 mult RR2, #0x94
4E 07 01 bmov bf, R1, #7
4E 47 01 bmov R1, #7, bf
4F 07 01 bcmp bf, R1, #7
4F 47 01 band bf, R1, #7
4F 87 01 bor bf, R1, #7
4F C7 01 bxor bf, R1, #7
50 94 01 cmp R1, #0x94
51 94 01 add R1, #0x94
52 94 01 sub R1, #0x94
53 94 01 adc R1, #0x94
54 94 01 sbc R1, #0x94
55 94 01 and R1, #0x94
56 94 01 or R1, #0x94
57 94 01 xor R1, #0x94
58 94 01 mov R1, #0x94
5C 04 02 div RR2, RR4
5D 94 02 div RR2, #0x94
5E 01 94 02 movm R1, #0x94, R2
5F 01 94 24 movm R1, #0x94, #0x24
60 04 02 cmpw RR2, RR4
61 04 02 addw RR2, RR4
62 04 02 subw RR2, RR4
63 04 02 adcw RR2, RR4
64 04 02 sbcw RR2, RR4
65 04 02 andw RR2, RR4
66 04 02 orw RR2, RR4
67 04 02 xorw RR2, RR4
68 02 24 94 cmpw RR2, #0x9424
69 02 24 94 addw RR2, #0x9424
6A 02 24 94 subw RR2, #0x9424
6B 02 24 94 adcw RR2, #0x9424
6C 02 24 94 sbcw RR2, #0x9424
6D 02 24 94 andw RR2, #0x9424
6E 02 24 94 orw RR2, #0x9424
6F 02 24 94 xorw RR2, #0x9424
78 24 94 movw rr0, #0x9424
79 24 94 movw rr8, #0x9424
7A 24 94 movw rr2, #0x9424
7B 24 94 movw rr10, #0x9424
7C 24 94 movw rr4, #0x9424
7D 24 94 movw rr12, #0x9424
7E 24 94 movw rr6, #0x9424
7F 24 94 movw rr14, #0x9424
A0 01 bclr R1, #0
A1 01 bclr R1, #1
A2 01 bclr R1, #2
A3 01 bclr R1, #3
A4 01 bclr R1, #4
A5 01 bclr R1, #5
A6 01 bclr R1, #6
A7 01 bclr R1, #7
A8 01 bset R1, #0
A9 01 bset R1, #1
AA 01 bset R1, #2
AB 01 bset R1, #3
AC 01 bset R1, #4
AD 01 bset R1, #5
AE 01 bset R1, #6
AF 01 bset R1, #7
B0 01 mov r0, R1
B1 01 mov r1, R1
B2 01 mov r2, R1
B3 01 mov r3, R1
B4 01 mov r4, R1
B5 01 mov r5, R1
B6 01 mov r6, R1
B7 01 mov r7, R1
B8 01 mov R1, r0
B9 01 mov R1, r1
BA 01 mov R1, r2
BB 01 mov R1, r3
BC 01 mov R1, r4
BD 01 mov R1, r5
BE 01 mov R1, r6
BF 01 mov R1, r7
C0 94 mov r0, #0x94
C1 94 mov r1, #0x94
C2 94 mov r2, #0x94
C3 94 mov r3, #0x94
C4 94 mov r4, #0x94
C5 94 mov r5, #0x94
C6 94 mov r6, #0x94
C7 94 mov r7, #0x94
C8 94 mov ie0, #0x94
C9 94 mov ie1, #0x94
CA 94 mov ir0, #0x94
CB 94 mov ir1, #0x94
CC 94 mov p0, #0x94
CD 94 mov p1, #0x94
CE 94 mov p2, #0x94
CF 94 mov p3, #0x94
F0 stop
F1 halt
F8 ret
F9 iret
FA clrc
FB comc
FC setc
FD ei
FE di
FF nop
assemble() returned 0: OK
Clean up...

Note: The output binary has not been verified for accuracy.

==Development==

As85 is a fairly simple assembler. It doesn't use any sort of 'compiler-compiler' for lexical analysis. In fact, its lexical analysis is very specific to the sm8521 MCU.

The main loop (the assemble() function, defined in [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/asm.c?at=default#cl-315 asm.c]) does the input text parsing inline. (This should probably be moved out to a new source file.) After splitting a line into two pieces (op[0] containing the instruction, and op[1] containing its operands), a string comparison against op[0] is done over all supported instructions. If a match is found, the operands string is passed to a dynamically chosen function (from a [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/inst.c?at=default#cl-156 function pointer table], indexed by the matched instruction). This function performs the lexical analysis required to decide which instruction we are trying to assemble.
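
The split-and-dispatch step described above can be sketched roughly as follows. This is a minimal illustration, not the code from asm.c: the names split_line(), assemble_line(), inst_handler, and the two-entry tables are hypothetical, and only the opcodes for ''nop'' (FF) and ''ret'' (F8) are taken from the test3.asm listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type: receives the operands string, writes the
 * encoded bytes to out, and returns the byte count (or -1 on error). */
typedef int (*inst_handler)(const char *operands, unsigned char *out);

static int asm_nop(const char *operands, unsigned char *out)
{
    (void)operands;          /* nop takes no operands */
    out[0] = 0xFF;
    return 1;
}

static int asm_ret(const char *operands, unsigned char *out)
{
    (void)operands;          /* ret takes no operands */
    out[0] = 0xF8;
    return 1;
}

/* Parallel tables: the index of the matched mnemonic selects the handler. */
static const char *const mnemonics[] = { "nop", "ret" };
static const inst_handler handlers[] = { asm_nop, asm_ret };

/* Split a line in place: op[0] is the mnemonic, op[1] the operands. */
static void split_line(char *line, char *op[2])
{
    line += strspn(line, " \t");                 /* skip leading blanks */
    op[0] = line;
    op[1] = line + strcspn(line, " \t\r\n");     /* end of the mnemonic */
    if (*op[1] != '\0') {
        *op[1]++ = '\0';                         /* terminate mnemonic  */
        op[1] += strspn(op[1], " \t");           /* skip to operands    */
    }
}

/* Linear search over the mnemonic table, then dispatch by index. */
static int assemble_line(char *line, unsigned char *out)
{
    char *op[2];
    size_t i;

    assert(line != NULL && out != NULL);
    split_line(line, op);
    for (i = 0; i < sizeof mnemonics / sizeof mnemonics[0]; i++) {
        if (strcmp(op[0], mnemonics[i]) == 0)
            return handlers[i](op[1], out);
    }
    return -1;                                   /* unknown instruction */
}
```

Calling assemble_line() on a writable buffer holding "ret" stores F8 and returns 1; an unrecognized mnemonic returns -1.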

Since the sm8521 is a CISC machine, its instruction set contains a number of different ways to assemble the same instruction mnemonic. For example, several different addressing modes for the ''mov'' instruction are shown above in the test3.asm output. The lexical analysis is the voodoo which picks the proper addressing mode and byte codes by analyzing the operands.
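
To make the addressing-mode selection concrete, here is a small sketch of how the second byte of the register/memory forms appears to be laid out. The bit layout (mode in bits 7-6, plain register in bits 5-3, pointer register in bits 2-0) is inferred only from the test3.asm listing, which itself has not been verified for accuracy; it has not been checked against the Sharp documentation or the as85 source. The names addr_mode and encode_reg_mem() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Mode field values, as they appear in the listing (an inference). */
enum addr_mode {
    MODE_INDIRECT = 0, /* @r                                        */
    MODE_POSTINC  = 1, /* (r)+                                      */
    MODE_INDEXED  = 2, /* n(r); the listing shows @n with ptr == 0  */
    MODE_PREDEC   = 3  /* -(r)                                      */
};

/* Encode opcode + mode byte (+ 8-bit displacement for indexed mode).
 * reg is the plain register operand, ptr the pointer register.
 * Returns the number of bytes written to out. */
static size_t encode_reg_mem(unsigned char opcode, enum addr_mode mode,
                             unsigned reg, unsigned ptr,
                             unsigned char disp, unsigned char *out)
{
    assert(out != NULL && reg < 8 && ptr < 8);
    out[0] = opcode;
    out[1] = (unsigned char)(((unsigned)mode << 6) | (reg << 3) | ptr);
    if (mode == MODE_INDEXED) {
        out[2] = disp;       /* displacement byte follows the mode byte */
        return 3;
    }
    return 2;
}
```

With these assumptions, opcode 0x28 with MODE_INDEXED, reg 1, ptr 2 and displacement 0x94 yields 28 8A 94, matching the listing line for ''mov r1, 0x94(r2)''.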

The lexical analyzing functions are defined in inst.c (following the function pointer table mentioned previously). The function handling the current instruction will test the operands string against a series of lexical patterns with the [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/asm.c?at=default#cl-89 chk_pattern()] function (defined in asm.c, although this should probably be moved).

chk_pattern() uses a scanf-like formatting string rather than a regular expression, the more common tool for lexical analysis. Documentation for the formatting string can be found in [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/inst.h?at=default#cl-29 inst.h]. With just a few pattern primitives, any of the sm8521's addressing modes can be matched, with the matched primitives written to an output array. It makes good use of the format scanners defined in [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/scan.c?at=default scan.c].
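
As a rough illustration of the idea (not of chk_pattern() itself), a scanf-like matcher can be written in a few dozen lines. Everything here is an assumption made for the example: the function name match_pattern(), the primitives %r (a byte register r0 to r7) and %n (a number in any base strtol() accepts), and the rule that a space in the pattern matches optional whitespace. The real primitives are the ones documented in inst.h.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Match str against pat, storing captured values in vals (at most max).
 * Returns the number of captures, or -1 if the string does not match. */
static int match_pattern(const char *pat, const char *str,
                         long *vals, int max)
{
    int n = 0;

    assert(pat != NULL && str != NULL && vals != NULL);
    while (*pat != '\0') {
        if (*pat == ' ') {                       /* optional whitespace */
            while (isspace((unsigned char)*str))
                str++;
            pat++;
        } else if (pat[0] == '%' && pat[1] == 'r') {   /* r0..r7 */
            if (str[0] != 'r' || str[1] < '0' || str[1] > '7' || n >= max)
                return -1;
            vals[n++] = str[1] - '0';
            str += 2;
            pat += 2;
        } else if (pat[0] == '%' && pat[1] == 'n') {   /* number */
            char *end;
            if (!isdigit((unsigned char)*str) || n >= max)
                return -1;
            vals[n++] = strtol(str, &end, 0);    /* base 0: accepts 0x.. */
            str = end;
            pat += 2;
        } else {                                 /* literal character */
            if (*pat != *str)
                return -1;
            pat++;
            str++;
        }
    }
    while (isspace((unsigned char)*str))         /* allow trailing blanks */
        str++;
    return *str == '\0' ? n : -1;
}
```

A handler would try one pattern per addressing mode in turn, for example "%r, @%r", then "%r, (%r)+", then "%r, %n(%r)", and use the first one that returns a non-negative count.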

If chk_pattern() manages to find a match, the matching data may be further checked for validity on a per-context basis. Finally, the full instruction byte codes are put together and returned to the assemble() loop. This is where the object code would be built. Currently, the only thing that happens is that the assembled instruction is dumped to stdout in a debug build. [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/src/asm.c?at=default#cl-464]
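
The debug dump can be pictured with a small formatter like the one below. This is a sketch under assumptions, not the code linked above: the function name format_listing() is hypothetical, and the layout (two-digit uppercase hex bytes, each followed by one space, then the source text) is simply what the example output on this page looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format "XX XX ... text" into buf. Returns the string length,
 * or -1 if buf (of the given size) is too small. */
static int format_listing(const unsigned char *bytes, size_t count,
                          const char *text, char *buf, size_t size)
{
    size_t pos = 0;
    size_t len;
    size_t i;

    assert(bytes != NULL && text != NULL && buf != NULL);
    for (i = 0; i < count; i++) {
        if (size - pos < 4)              /* "XX " plus the terminator */
            return -1;
        pos += (size_t)snprintf(buf + pos, size - pos, "%02X ", bytes[i]);
    }
    len = strlen(text);
    if (len >= size - pos)               /* no room for text + '\0' */
        return -1;
    memcpy(buf + pos, text, len + 1);    /* copy including terminator */
    return (int)(pos + len);
}
```

Passing the bytes 28 8A 94 and the text "mov r1, 0x94(r2)" produces the same line as the listing; a real object-code writer would append the bytes to an output buffer at this point instead of printing them.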

===Optimization Concerns===

Some optimization could be done within this lexical analysis process. The first improvement would be replacing the linear string comparison with a binary search tree. The second thing that would help in this immediate area would be replacing the string comparison itself with a hash comparison. The hash algorithm would have to be suitably small and fast enough to make much of a difference.
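
One simple way to get logarithmic lookup is sketched below. Note that it swaps in a binary search over a sorted array (via the standard bsearch() function) rather than an explicit binary search tree; the lookup cost is comparable and no tree has to be built. The table contents and the names sorted_mnemonics and find_mnemonic() are hypothetical, and the table must be kept in strcmp() order for the search to work.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical subset of mnemonics, sorted in strcmp() order. */
static const char *const sorted_mnemonics[] = {
    "adc", "add", "and", "cmp", "mov", "or", "sbc", "sub", "xor"
};

/* bsearch() comparator: key is a string, elem points at a table entry. */
static int cmp_mnemonic(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return the table index of name, or -1 if it is not present.
 * The index can then select an entry in a function pointer table. */
static int find_mnemonic(const char *name)
{
    const char *const *hit;

    assert(name != NULL);
    hit = bsearch(name, sorted_mnemonics,
                  sizeof sorted_mnemonics / sizeof sorted_mnemonics[0],
                  sizeof sorted_mnemonics[0], cmp_mnemonic);
    return hit != NULL ? (int)(hit - sorted_mnemonics) : -1;
}
```

For a table of roughly a hundred mnemonics this cuts the worst case from about a hundred string comparisons to about seven. Whether that matters for an assembler of this size is an open question.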

The next big optimization would be rearranging the chk_pattern() calls within each lexical analyzer to check the most likely patterns first. The best way to choose the order is static analysis of existing sm8521 source code, which is obviously in very short supply. Accurate disassemblies of commercial Game.com games would be helpful to this end, however.

[[Category:Developer_Documentation]]

As85

2012-12-30T04:09:37Z

Parasyte: /* Usage */ Update source link

as85 is a simple assembler for the Sharp sm8521, the same microcontroller used in the Tiger Game.com. Game.com was released in 1997 and had only a few games ever made for it. It also has not had any homebrew games made for it. as85 is an attempt to build an assembler that will help hackers write homebrew code that will run on Game.com hardware.

Documentation on the Game.com hardware is available at [http://gamecom.guruwork.de/ Game.commies].

==Download==

The source code is available at http://git.kodewerx.org/as85/src/

==Current Progress==

The current state of as85 is "almost usable, but not quite there yet." A number of bugs exist which need to be fixed before it can be used as a development tool:

* [http://bugzilla.kodewerx.org/show_bug.cgi?id=2 Bug 2]: Add support for jump/call/branch instructions
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=3 Bug 3]: Output object code
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=4 Bug 4]: Write a linker

I've also filed a bug about giving the project a better name [http://bugzilla.kodewerx.org/show_bug.cgi?id=5].

==Usage==

The program takes one argument: the file name of an sm8521 assembly file. The files in the [http://git.kodewerx.org/as85/src/a90c47020126a58874638e66c0fbf90fb5ab53c2/examples?at=default /examples] directory are a good place to start.

* test.asm is an example of what the beginning of a Game.com program might look like; it contains a header, and some [random] instructions to give you an idea.
* test2.asm is for testing the integrity of the parser with complex strings.
* test3.asm lists all possible sm8521 instructions, for verifying that the output binary is correct.

==Example Output==

The following command:
$ ./as85 ../examples/test3.asm

Produces the following output:
00 01 clr R1
01 01 neg R1
02 01 com R1
03 01 rr R1
04 01 rl R1
05 01 rrc R1
06 01 rlc R1
07 01 srl R1
08 01 inc R1
09 01 dec R1
0A 01 sra R1
0B 01 sll R1
0C 01 da R1
0D 01 swap R1
0E 01 push R1
0F 01 pop R1
10 0A cmp r1, r2
11 0A add r1, r2
12 0A sub r1, r2
13 0A adc r1, r2
14 0A sbc r1, r2
15 0A and r1, r2
16 0A or r1, r2
17 0A xor r1, r2
18 02 incw RR2
19 02 decw RR2
1A 08 clr @r1
1A 09 neg @r1
1A 0A com @r1
1A 0B rr @r1
1A 0C rl @r1
1A 0D rrc @r1
1A 0E rlc @r1
1A 0F srl @r1
1B 08 inc @r1
1B 09 dec @r1
1B 0A sra @r1
1B 0B sll @r1
1B 0C da @r1
1B 0D swap @r1
1B 0E push @r1
1B 0F pop @r1
1C 07 24 bclr 0xFF24, #7
1C 0F 94 bclr 0x94(r1), #7
1D 07 24 bset 0xFF24, #7
1D 0F 94 bset 0x94(r1), #7
1E 02 pushw RR2
1F 02 popw RR2
20 0A cmp r1, @r2
20 4A cmp r1, (r2)+
20 88 94 cmp r1, @0x94
20 8A 94 cmp r1, 0x94(r2)
20 CA cmp r1, -(r2)
21 0A add r1, @r2
21 4A add r1, (r2)+
21 88 94 add r1, @0x94
21 8A 94 add r1, 0x94(r2)
21 CA add r1, -(r2)
22 0A sub r1, @r2
22 4A sub r1, (r2)+
22 88 94 sub r1, @0x94
22 8A 94 sub r1, 0x94(r2)
22 CA sub r1, -(r2)
23 0A adc r1, @r2
23 4A adc r1, (r2)+
23 88 94 adc r1, @0x94
23 8A 94 adc r1, 0x94(r2)
23 CA adc r1, -(r2)
24 0A sbc r1, @r2
24 4A sbc r1, (r2)+
24 88 94 sbc r1, @0x94
24 8A 94 sbc r1, 0x94(r2)
24 CA sbc r1, -(r2)
25 0A and r1, @r2
25 4A and r1, (r2)+
25 88 94 and r1, @0x94
25 8A 94 and r1, 0x94(r2)
25 CA and r1, -(r2)
26 0A or r1, @r2
26 4A or r1, (r2)+
26 88 94 or r1, @0x94
26 8A 94 or r1, 0x94(r2)
26 CA or r1, -(r2)
27 0A xor r1, @r2
27 4A xor r1, (r2)+
27 88 94 xor r1, @0x94
27 8A 94 xor r1, 0x94(r2)
27 CA xor r1, -(r2)
28 0A mov r1, @r2
28 4A mov r1, (r2)+
28 88 94 mov r1, @0x94
28 8A 94 mov r1, 0x94(r2)
28 CA mov r1, -(r2)
29 11 mov @r1, r2
29 51 mov (r1)+, r2
29 90 94 mov @0x94, r2
29 91 94 mov 0x94(r1), r2
29 D1 mov -(r1), r2
2C 02 exts RR2
2E 94 mov ps0, #0x94
2F 94 01 btst R1, #0x94
30 09 cmp r1, @rr2
30 49 cmp r1, (rr2)+
30 88 24 94 cmp r1, @0x9424
30 89 24 94 cmp r1, 0x9424(rr2)
30 C9 cmp r1, -(rr2)
31 09 add r1, @rr2
31 49 add r1, (rr2)+
31 88 24 94 add r1, @0x9424
31 89 24 94 add r1, 0x9424(rr2)
31 C9 add r1, -(rr2)
32 09 sub r1, @rr2
32 49 sub r1, (rr2)+
32 88 24 94 sub r1, @0x9424
32 89 24 94 sub r1, 0x9424(rr2)
32 C9 sub r1, -(rr2)
33 09 adc r1, @rr2
33 49 adc r1, (rr2)+
33 88 24 94 adc r1, @0x9424
33 89 24 94 adc r1, 0x9424(rr2)
33 C9 adc r1, -(rr2)
34 09 sbc r1, @rr2
34 49 sbc r1, (rr2)+
34 88 24 94 sbc r1, @0x9424
34 89 24 94 sbc r1, 0x9424(rr2)
34 C9 sbc r1, -(rr2)
35 09 and r1, @rr2
35 49 and r1, (rr2)+
35 88 24 94 and r1, @0x9424
35 89 24 94 and r1, 0x9424(rr2)
35 C9 and r1, -(rr2)
36 09 or r1, @rr2
36 49 or r1, (rr2)+
36 88 24 94 or r1, @0x9424
36 89 24 94 or r1, 0x9424(rr2)
36 C9 or r1, -(rr2)
37 09 xor r1, @rr2
37 49 xor r1, (rr2)+
37 88 24 94 xor r1, @0x9424
37 89 24 94 xor r1, 0x9424(rr2)
37 C9 xor r1, -(rr2)
38 09 mov r1, @rr2
38 49 mov r1, (rr2)+
38 88 24 94 mov r1, @0x9424
38 89 24 94 mov r1, 0x9424(rr2)
38 C9 mov r1, -(rr2)
39 21 mov @rr2, r4
39 61 mov (rr2)+, r4
39 A0 24 94 mov @0x9424, r4
39 A1 24 94 mov 0x9424(rr2), r4
39 E1 mov -(rr2), r4
3A 14 movw rr2, @rr4
3A 54 movw rr2, (rr4)+
3A 90 24 94 movw rr2, @0x9424
3A 94 24 94 movw rr2, 0x9424(rr4)
3A D4 movw rr2, -(rr4)
3B 22 movw @rr2, rr4
3B 62 movw (rr2)+, rr4
3B A0 24 94 movw @0x9424, rr4
3B A2 24 94 movw 0x9424(rr2), rr4
3B E2 movw -(rr2), rr4
3C 14 movw rr2, rr4
40 02 01 cmp R1, R2
41 02 01 add R1, R2
42 02 01 sub R1, R2
43 02 01 adc R1, R2
44 02 01 sbc R1, R2
45 02 01 and R1, R2
46 02 01 or R1, R2
47 02 01 xor R1, R2
48 02 01 mov R1, R2
4A 04 02 movw RR2, RR4
4B 02 24 94 movw RR2, #0x9424
4C 04 02 mult RR2, R4
4D 94 02 mult RR2, #0x94
4E 07 01 bmov bf, R1, #7
4E 47 01 bmov R1, #7, bf
4F 07 01 bcmp bf, R1, #7
4F 47 01 band bf, R1, #7
4F 87 01 bor bf, R1, #7
4F C7 01 bxor bf, R1, #7
50 94 01 cmp R1, #0x94
51 94 01 add R1, #0x94
52 94 01 sub R1, #0x94
53 94 01 adc R1, #0x94
54 94 01 sbc R1, #0x94
55 94 01 and R1, #0x94
56 94 01 or R1, #0x94
57 94 01 xor R1, #0x94
58 94 01 mov R1, #0x94
5C 04 02 div RR2, RR4
5D 94 02 div RR2, #0x94
5E 01 94 02 movm R1, #0x94, R2
5F 01 94 24 movm R1, #0x94, #0x24
60 04 02 cmpw RR2, RR4
61 04 02 addw RR2, RR4
62 04 02 subw RR2, RR4
63 04 02 adcw RR2, RR4
64 04 02 sbcw RR2, RR4
65 04 02 andw RR2, RR4
66 04 02 orw RR2, RR4
67 04 02 xorw RR2, RR4
68 02 24 94 cmpw RR2, #0x9424
69 02 24 94 addw RR2, #0x9424
6A 02 24 94 subw RR2, #0x9424
6B 02 24 94 adcw RR2, #0x9424
6C 02 24 94 sbcw RR2, #0x9424
6D 02 24 94 andw RR2, #0x9424
6E 02 24 94 orw RR2, #0x9424
6F 02 24 94 xorw RR2, #0x9424
78 24 94 movw rr0, #0x9424
79 24 94 movw rr8, #0x9424
7A 24 94 movw rr2, #0x9424
7B 24 94 movw rr10, #0x9424
7C 24 94 movw rr4, #0x9424
7D 24 94 movw rr12, #0x9424
7E 24 94 movw rr6, #0x9424
7F 24 94 movw rr14, #0x9424
A0 01 bclr R1, #0
A1 01 bclr R1, #1
A2 01 bclr R1, #2
A3 01 bclr R1, #3
A4 01 bclr R1, #4
A5 01 bclr R1, #5
A6 01 bclr R1, #6
A7 01 bclr R1, #7
A8 01 bset R1, #0
A9 01 bset R1, #1
AA 01 bset R1, #2
AB 01 bset R1, #3
AC 01 bset R1, #4
AD 01 bset R1, #5
AE 01 bset R1, #6
AF 01 bset R1, #7
B0 01 mov r0, R1
B1 01 mov r1, R1
B2 01 mov r2, R1
B3 01 mov r3, R1
B4 01 mov r4, R1
B5 01 mov r5, R1
B6 01 mov r6, R1
B7 01 mov r7, R1
B8 01 mov R1, r0
B9 01 mov R1, r1
BA 01 mov R1, r2
BB 01 mov R1, r3
BC 01 mov R1, r4
BD 01 mov R1, r5
BE 01 mov R1, r6
BF 01 mov R1, r7
C0 94 mov r0, #0x94
C1 94 mov r1, #0x94
C2 94 mov r2, #0x94
C3 94 mov r3, #0x94
C4 94 mov r4, #0x94
C5 94 mov r5, #0x94
C6 94 mov r6, #0x94
C7 94 mov r7, #0x94
C8 94 mov ie0, #0x94
C9 94 mov ie1, #0x94
CA 94 mov ir0, #0x94
CB 94 mov ir1, #0x94
CC 94 mov p0, #0x94
CD 94 mov p1, #0x94
CE 94 mov p2, #0x94
CF 94 mov p3, #0x94
F0 stop
F1 halt
F8 ret
F9 iret
FA clrc
FB comc
FC setc
FD ei
FE di
FF nop
assemble() returned 0: OK
Clean up...

Note: The output binary has not been verified for accuracy.

==Development==

As85 is a fairly simple assembler. It doesn't use any sort of 'compiler-compiler' for lexical analysis. In fact, its lexical analysis is very specific to the sm8521 MCU.

The main loop (the assemble() function, defined in [http://hg.kodewerx.org/as85/file/a90c47020126/src/asm.c#l315 asm.c]) does the input text parsing inline. (This should probably be moved out to a new source file.) After splitting a line into two pieces (op[0] containing the instruction, and op[1] containing its operands), a string comparison against op[0] is done over all supported instructions. If a match is found, the operands string is passed to a dynamically chosen function (from a [http://hg.kodewerx.org/as85/file/a90c47020126/src/inst.c#l156 function pointer table], indexed by the matched instruction). This function performs the lexical analysis required to decide which instruction we are trying to assemble.

Since the sm8521 is a CISC machine, its instruction set contains a number of different ways to assemble the same instruction mnemonic. For example, several different addressing modes for the ''mov'' instruction are shown above in the test3.asm output. The lexical analysis is the voodoo which picks the proper addressing mode and byte codes by analyzing the operands.

The lexical analyzing functions are defined in inst.c (following the function pointer table mentioned previously). The function handling the current instruction will test the operands string against a series of lexical patterns with the [http://hg.kodewerx.org/as85/file/a90c47020126/src/asm.c#l89 chk_pattern()] function (defined in asm.c, although this should probably be moved).

chk_pattern() uses a scanf-like formatting string rather than a regular expression, the more common tool for lexical analysis. Documentation for the formatting string can be found in [http://hg.kodewerx.org/as85/file/a90c47020126/src/inst.h#l29 inst.h]. With just a few pattern primitives, any of the sm8521's addressing modes can be matched, with the matched primitives written to an output array. It makes good use of the format scanners defined in [http://hg.kodewerx.org/as85/file/a90c47020126/src/scan.c scan.c].

If chk_pattern() manages to find a match, the matching data may be further checked for validity on a per-context basis. Finally, the full instruction byte codes are put together and returned to the assemble() loop. This is where the object code would be built. Currently, the only thing that happens is that the assembled instruction is dumped to stdout in a debug build. [http://hg.kodewerx.org/as85/file/a90c47020126/src/asm.c#l464]

===Optimization Concerns===

Some optimization could be done within this lexical analysis process. The first improvement would be replacing the linear string comparison with a binary search tree. The second thing that would help in this immediate area would be replacing the string comparison itself with a hash comparison. The hash algorithm would have to be suitably small and fast enough to make much of a difference.

The next big optimization would be rearranging the chk_pattern() calls within each lexical analyzer to check the most likely patterns first. The best way to choose the order is static analysis of existing sm8521 source code, which is obviously in very short supply. Accurate disassemblies of commercial Game.com games would be helpful to this end, however.

[[Category:Developer_Documentation]]

As85

2012-12-30T04:08:56Z

Parasyte: /* Download */ Update source link

as85 is a simple assembler for the Sharp sm8521, the same microcontroller used in the Tiger Game.com. Game.com was released in 1997 and had only a few games ever made for it. It also has not had any homebrew games made for it. as85 is an attempt to build an assembler that will help hackers write homebrew code that will run on Game.com hardware.

Documentation on the Game.com hardware is available at [http://gamecom.guruwork.de/ Game.commies].

==Download==

The source code is available at http://git.kodewerx.org/as85/src/

==Current Progress==

The current state of as85 is "almost usable, but not quite there yet." A number of bugs exist which need to be fixed before it can be used as a development tool:

* [http://bugzilla.kodewerx.org/show_bug.cgi?id=2 Bug 2]: Add support for jump/call/branch instructions
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=3 Bug 3]: Output object code
* [http://bugzilla.kodewerx.org/show_bug.cgi?id=4 Bug 4]: Write a linker

I've also filed a bug about giving the project a better name [http://bugzilla.kodewerx.org/show_bug.cgi?id=5].

==Usage==

The program takes one argument: the file name of an sm8521 assembly file. The files in the [http://hg.kodewerx.org/as85/file/tip/examples /examples] directory are a good place to start.

* test.asm is an example of what the beginning of a Game.com program might look like; it contains a header, and some [random] instructions to give you an idea.
* test2.asm is for testing the integrity of the parser with complex strings.
* test3.asm lists all possible sm8521 instructions, for verifying that the output binary is correct.

==Example Output==

The following command:
$ ./as85 ../examples/test3.asm

Produces the following output:
00 01 clr R1
01 01 neg R1
02 01 com R1
03 01 rr R1
04 01 rl R1
05 01 rrc R1
06 01 rlc R1
07 01 srl R1
08 01 inc R1
09 01 dec R1
0A 01 sra R1
0B 01 sll R1
0C 01 da R1
0D 01 swap R1
0E 01 push R1
0F 01 pop R1
10 0A cmp r1, r2
11 0A add r1, r2
12 0A sub r1, r2
13 0A adc r1, r2
14 0A sbc r1, r2
15 0A and r1, r2
16 0A or r1, r2
17 0A xor r1, r2
18 02 incw RR2
19 02 decw RR2
1A 08 clr @r1
1A 09 neg @r1
1A 0A com @r1
1A 0B rr @r1
1A 0C rl @r1
1A 0D rrc @r1
1A 0E rlc @r1
1A 0F srl @r1
1B 08 inc @r1
1B 09 dec @r1
1B 0A sra @r1
1B 0B sll @r1
1B 0C da @r1
1B 0D swap @r1
1B 0E push @r1
1B 0F pop @r1
1C 07 24 bclr 0xFF24, #7
1C 0F 94 bclr 0x94(r1), #7
1D 07 24 bset 0xFF24, #7
1D 0F 94 bset 0x94(r1), #7
1E 02 pushw RR2
1F 02 popw RR2
20 0A cmp r1, @r2
20 4A cmp r1, (r2)+
20 88 94 cmp r1, @0x94
20 8A 94 cmp r1, 0x94(r2)
20 CA cmp r1, -(r2)
21 0A add r1, @r2
21 4A add r1, (r2)+
21 88 94 add r1, @0x94
21 8A 94 add r1, 0x94(r2)
21 CA add r1, -(r2)
22 0A sub r1, @r2
22 4A sub r1, (r2)+
22 88 94 sub r1, @0x94
22 8A 94 sub r1, 0x94(r2)
22 CA sub r1, -(r2)
23 0A adc r1, @r2
23 4A adc r1, (r2)+
23 88 94 adc r1, @0x94
23 8A 94 adc r1, 0x94(r2)
23 CA adc r1, -(r2)
24 0A sbc r1, @r2
24 4A sbc r1, (r2)+
24 88 94 sbc r1, @0x94
24 8A 94 sbc r1, 0x94(r2)
24 CA sbc r1, -(r2)
25 0A and r1, @r2
25 4A and r1, (r2)+
25 88 94 and r1, @0x94
25 8A 94 and r1, 0x94(r2)
25 CA and r1, -(r2)
26 0A or r1, @r2
26 4A or r1, (r2)+
26 88 94 or r1, @0x94
26 8A 94 or r1, 0x94(r2)
26 CA or r1, -(r2)
27 0A xor r1, @r2
27 4A xor r1, (r2)+
27 88 94 xor r1, @0x94
27 8A 94 xor r1, 0x94(r2)
27 CA xor r1, -(r2)
28 0A mov r1, @r2
28 4A mov r1, (r2)+
28 88 94 mov r1, @0x94
28 8A 94 mov r1, 0x94(r2)
28 CA mov r1, -(r2)
29 11 mov @r1, r2
29 51 mov (r1)+, r2
29 90 94 mov @0x94, r2
29 91 94 mov 0x94(r1), r2
29 D1 mov -(r1), r2
2C 02 exts RR2
2E 94 mov ps0, #0x94
2F 94 01 btst R1, #0x94
30 09 cmp r1, @rr2
30 49 cmp r1, (rr2)+
30 88 24 94 cmp r1, @0x9424
30 89 24 94 cmp r1, 0x9424(rr2)
30 C9 cmp r1, -(rr2)
31 09 add r1, @rr2
31 49 add r1, (rr2)+
31 88 24 94 add r1, @0x9424
31 89 24 94 add r1, 0x9424(rr2)
31 C9 add r1, -(rr2)
32 09 sub r1, @rr2
32 49 sub r1, (rr2)+
32 88 24 94 sub r1, @0x9424
32 89 24 94 sub r1, 0x9424(rr2)
32 C9 sub r1, -(rr2)
33 09 adc r1, @rr2
33 49 adc r1, (rr2)+
33 88 24 94 adc r1, @0x9424
33 89 24 94 adc r1, 0x9424(rr2)
33 C9 adc r1, -(rr2)
34 09 sbc r1, @rr2
34 49 sbc r1, (rr2)+
34 88 24 94 sbc r1, @0x9424
34 89 24 94 sbc r1, 0x9424(rr2)
34 C9 sbc r1, -(rr2)
35 09 and r1, @rr2
35 49 and r1, (rr2)+
35 88 24 94 and r1, @0x9424
35 89 24 94 and r1, 0x9424(rr2)
35 C9 and r1, -(rr2)
36 09 or r1, @rr2
36 49 or r1, (rr2)+
36 88 24 94 or r1, @0x9424
36 89 24 94 or r1, 0x9424(rr2)
36 C9 or r1, -(rr2)
37 09 xor r1, @rr2
37 49 xor r1, (rr2)+
37 88 24 94 xor r1, @0x9424
37 89 24 94 xor r1, 0x9424(rr2)
37 C9 xor r1, -(rr2)
38 09 mov r1, @rr2
38 49 mov r1, (rr2)+
38 88 24 94 mov r1, @0x9424
38 89 24 94 mov r1, 0x9424(rr2)
38 C9 mov r1, -(rr2)
39 21 mov @rr2, r4
39 61 mov (rr2)+, r4
39 A0 24 94 mov @0x9424, r4
39 A1 24 94 mov 0x9424(rr2), r4
39 E1 mov -(rr2), r4
3A 14 movw rr2, @rr4
3A 54 movw rr2, (rr4)+
3A 90 24 94 movw rr2, @0x9424
3A 94 24 94 movw rr2, 0x9424(rr4)
3A D4 movw rr2, -(rr4)
3B 22 movw @rr2, rr4
3B 62 movw (rr2)+, rr4
3B A0 24 94 movw @0x9424, rr4
3B A2 24 94 movw 0x9424(rr2), rr4
3B E2 movw -(rr2), rr4
3C 14 movw rr2, rr4
40 02 01 cmp R1, R2
41 02 01 add R1, R2
42 02 01 sub R1, R2
43 02 01 adc R1, R2
44 02 01 sbc R1, R2
45 02 01 and R1, R2
46 02 01 or R1, R2
47 02 01 xor R1, R2
48 02 01 mov R1, R2
4A 04 02 movw RR2, RR4
4B 02 24 94 movw RR2, #0x9424
4C 04 02 mult RR2, R4
4D 94 02 mult RR2, #0x94
4E 07 01 bmov bf, R1, #7
4E 47 01 bmov R1, #7, bf
4F 07 01 bcmp bf, R1, #7
4F 47 01 band bf, R1, #7
4F 87 01 bor bf, R1, #7
4F C7 01 bxor bf, R1, #7
50 94 01 cmp R1, #0x94
51 94 01 add R1, #0x94
52 94 01 sub R1, #0x94
53 94 01 adc R1, #0x94
54 94 01 sbc R1, #0x94
55 94 01 and R1, #0x94
56 94 01 or R1, #0x94
57 94 01 xor R1, #0x94
58 94 01 mov R1, #0x94
5C 04 02 div RR2, RR4
5D 94 02 div RR2, #0x94
5E 01 94 02 movm R1, #0x94, R2
5F 01 94 24 movm R1, #0x94, #0x24
60 04 02 cmpw RR2, RR4
61 04 02 addw RR2, RR4
62 04 02 subw RR2, RR4
63 04 02 adcw RR2, RR4
64 04 02 sbcw RR2, RR4
65 04 02 andw RR2, RR4
66 04 02 orw RR2, RR4
67 04 02 xorw RR2, RR4
68 02 24 94 cmpw RR2, #0x9424
69 02 24 94 addw RR2, #0x9424
6A 02 24 94 subw RR2, #0x9424
6B 02 24 94 adcw RR2, #0x9424
6C 02 24 94 sbcw RR2, #0x9424
6D 02 24 94 andw RR2, #0x9424
6E 02 24 94 orw RR2, #0x9424
6F 02 24 94 xorw RR2, #0x9424
78 24 94 movw rr0, #0x9424
79 24 94 movw rr8, #0x9424
7A 24 94 movw rr2, #0x9424
7B 24 94 movw rr10, #0x9424
7C 24 94 movw rr4, #0x9424
7D 24 94 movw rr12, #0x9424
7E 24 94 movw rr6, #0x9424
7F 24 94 movw rr14, #0x9424
A0 01 bclr R1, #0
A1 01 bclr R1, #1
A2 01 bclr R1, #2
A3 01 bclr R1, #3
A4 01 bclr R1, #4
A5 01 bclr R1, #5
A6 01 bclr R1, #6
A7 01 bclr R1, #7
A8 01 bset R1, #0
A9 01 bset R1, #1
AA 01 bset R1, #2
AB 01 bset R1, #3
AC 01 bset R1, #4
AD 01 bset R1, #5
AE 01 bset R1, #6
AF 01 bset R1, #7
B0 01 mov r0, R1
B1 01 mov r1, R1
B2 01 mov r2, R1
B3 01 mov r3, R1
B4 01 mov r4, R1
B5 01 mov r5, R1
B6 01 mov r6, R1
B7 01 mov r7, R1
B8 01 mov R1, r0
B9 01 mov R1, r1
BA 01 mov R1, r2
BB 01 mov R1, r3
BC 01 mov R1, r4
BD 01 mov R1, r5
BE 01 mov R1, r6
BF 01 mov R1, r7
C0 94 mov r0, #0x94
C1 94 mov r1, #0x94
C2 94 mov r2, #0x94
C3 94 mov r3, #0x94
C4 94 mov r4, #0x94
C5 94 mov r5, #0x94
C6 94 mov r6, #0x94
C7 94 mov r7, #0x94
C8 94 mov ie0, #0x94
C9 94 mov ie1, #0x94
CA 94 mov ir0, #0x94
CB 94 mov ir1, #0x94
CC 94 mov p0, #0x94
CD 94 mov p1, #0x94
CE 94 mov p2, #0x94
CF 94 mov p3, #0x94
F0 stop
F1 halt
F8 ret
F9 iret
FA clrc
FB comc
FC setc
FD ei
FE di
FF nop
assemble() returned 0: OK
Clean up...

Note: The output binary has not been verified for accuracy.

==Development==

As85 is a fairly simple assembler. It doesn't use any sort of 'compiler-compiler' for lexical analysis. In fact, its lexical analysis is very specific to the sm8521 MCU.

The main loop (the assemble() function, defined in [http://hg.kodewerx.org/as85/file/a90c47020126/src/asm.c#l315 asm.c]) does the input text parsing inline. (This should probably be moved out to a new source file.) After splitting a line into two pieces (op[0] containing the instruction, and op[1] containing its operands), a string comparison against op[0] is done over all supported instructions. If a match is found, the operands string is passed to a dynamically chosen function (from a [http://hg.kodewerx.org/as85/file/a90c47020126/src/inst.c#l156 function pointer table], indexed by the matched instruction). This function performs the lexical analysis required to decide which instruction we are trying to assemble.

Since the sm8521 is a CISC machine, its instruction set contains a number of different ways to assemble the same instruction mnemonic. For example, several different addressing modes for the ''mov'' instruction are shown above in the test3.asm output. The lexical analysis is the voodoo which picks the proper addressing mode and byte codes by analyzing the operands.

The lexical analyzing functions are defined in inst.c (following the function pointer table mentioned previously). The function handling the current instruction will test the operands string against a series of lexical patterns with the [http://hg.kodewerx.org/as85/file/a90c47020126/src/asm.c#l89 chk_pattern()] function (defined in asm.c, although this should probably be moved).

chk_pattern() uses a scanf-like formatting string rather than a regular expression, the more common tool for lexical analysis. Documentation for the formatting string can be found in [http://hg.kodewerx.org/as85/file/a90c47020126/src/inst.h#l29 inst.h]. With just a few pattern primitives, any of the sm8521's addressing modes can be matched, with the matched primitives written to an output array. It makes good use of the format scanners defined in [http://hg.kodewerx.org/as85/file/a90c47020126/src/scan.c scan.c].

If chk_pattern() manages to find a match, the matching data may be further checked for validity on a per-context basis. Finally, the full instruction byte codes are put together and returned to the assemble() loop. This is where the object code would be built. Currently, the only thing that happens is that the assembled instruction is dumped to stdout in a debug build. [http://hg.kodewerx.org/as85/file/a90c47020126/src/asm.c#l464]

===Optimization Concerns===

Some optimization could be done within this lexical analysis process. The first improvement would be replacing the linear string comparison with a binary search tree. The second thing that would help in this immediate area would be replacing the string comparison itself with a hash comparison. The hash algorithm would have to be suitably small and fast enough to make much of a difference.

The next big optimization would be rearranging the chk_pattern() calls within each lexical analyzer to check the most likely patterns first. The best way to choose the order is static analysis of existing sm8521 source code, which is obviously in very short supply. Accurate disassemblies of commercial Game.com games would be helpful to this end, however.

[[Category:Developer_Documentation]]

File:Syndrome2.png

2011-07-02T08:49:10Z

Parasyte: uploaded a new version of "File:Syndrome2.png"

First version of Syndrome, with empty sidebar; C++, MFC.

File:Syndrome.png

2011-07-02T08:48:54Z

Parasyte: uploaded a new version of "File:Syndrome.png"

First version of syndrome; C++, MFC.

File:Syn4.png

2011-07-02T08:48:40Z

Parasyte: uploaded a new version of "File:Syn4.png"