Goal
To describe a method for parsing text, along with other conceptually similar operations, in a consistent, non-recursive way.
Contents
- Goal
- Contents
- Abstract
- Introduction
- The State
- Commands
- Scopes
- Case study: JSON
- Performance
- Errors
- Conclusion
- Appendix A: Core
- Appendix B: Installation
- Appendix C: Example
Abstract
A method for text parsing is presented that makes use of a state construct that accepts individual characters of text as commands and is modified based on a set of rules describing a specific text format. Unlike other methods, the core process of parsing and the ruleset describing a format are logically separated. Only the ruleset must be changed to support a new format. This approach can also be used for event-driven application states such as a text editor. Another unique aspect of this approach is the lack of recursive processing of substructures. Processing is done linearly for each command received, making efficient use of available memory, although any rule can be arbitrarily complex.
Introduction
Data models are often represented as text or binary data to allow wide support of storage and transmission. The simpler the representation of the data, the more likely it is to be portable. The process of parsing text is meant to recover the data model to allow more complex and efficient manipulations. For example, CSV data is meant to allow a table to be built. Once this is done, questions such as "what is the sum of the values in a column containing price data?" are much easier to answer. It would be inefficient to carry out table operations on the raw text of a CSV representation.
In a typical text parser, text is converted into an abstraction using "forward-looking" templates, such as regular expressions. They require enough data to be buffered for a template to produce a match before a decision can be made about updating the state. For example, a parser for JSON very often will have a series of counters to allow the beginning and end of sections bounded by brackets to be marked. This leads to loosely organised sets of implicit rules making up the parsing code. Much of the code is then specific to a particular format and the logical similarities of different parsing processes cannot be abstracted.
Another common pattern is to operate recursively, locking up memory by storing a parent structure while its substructures are processed. This limits the ability of the parsing program to restart or recover from errors. In a world where data is exchanged in discrete chunks that may or may not contain a complete abstract data representation, this is not a helpful pattern.
The goal of the current work is to find a way of logically separating the core parsing algorithm from the set of rules that apply to a text format. In this way, the rules can be expressed as a single item and removed from the core code entirely. Additionally, the parsing method should allow a representation to be held in a pending state if the data it receives does not lead to a complete parse.
The code provided is written in Common Lisp. The compiler used was SBCL 1.5.4 on OSX 10.13.6.
The State
A string is a linear set of characters, but the data that it represents could be a point cloud of aircraft locations, a complex molecule, or a family tree. In each case, a hierarchy of objects can be created that allows the desired manipulations to be executed. To use JSON as an example, the desired data model is a tree of items of several different types: objects with keys and values, arrays, strings, integers, boolean values, a null value, and numbers.
This abstraction is known in the current work as "the State", which is a single object that can only be modified by methods that are pure functions of the existing state and a single incoming command character. The core process governs the relationship between the incoming data and the commands triggered on the State. The set of commands available is the ruleset governing the format of the data abstraction.
To use an analogy from biology, a polymerase replicates a strand of RNA by stepping along an existing strand (the data), reacting to each pair of bases it encounters (the commands), and building a new strand (the State) accordingly.
Commands
In the case of text parsing, individual characters can be interpreted as commands, but more generally, a command could be a user input in an application or even a chunk of binary data with a fixed size. The effect of the same command might differ based on the commands the State has already received. To use the task of parsing a number from a set of characters as an example, each digit of the number is a command, and based on the rules the State uses to interpret these commands, the outcome could differ substantially.
Starting from a string of text "121683", parsing the value can be approached in two ways. Either the parser starts at one end or the other. Using the same process, but different rules, the same outcome can be achieved. The rule when starting from the right is "interpreting the command as an integer, multiply the next command by the power of ten equal to the number of commands processed and add it to the value in the State. Afterwards, add one to the count.". The State in this case is an object of the form:
State {
value: 0
count: 0
}
Each step would yield:
3 -> 3*10^0 + 0 = 3 -> { value: 3, count: 1 }
8 -> 8*10^1 + 3 = 83 -> { value: 83, count: 2 }
6 -> 6*10^2 + 83 = 683 -> { value: 683, count: 3 }
1 -> 1*10^3 + 683 = 1683 -> { value: 1683, count: 4 }
2 -> 2*10^4 + 1683 = 21683 -> { value: 21683, count: 5 }
1 -> 1*10^5 + 21683 = 121683 -> { value: 121683, count: 6 }
8 -> 8*10^1 + 3 = 83 -> { value: 83, count: 2 }
6 -> 6*10^2 + 83 = 683 -> { value: 683, count: 3 }
1 -> 1*10^3 + 683 = 1683 -> { value: 1683, count: 4 }
2 -> 2*10^4 + 1683 = 21683 -> { value: 21683, count: 5 }
1 -> 1*10^5 + 21683 = 121683 -> { value: 121683, count: 6 }
The rule for receiving commands from the left would be "interpreting the command as an integer, multiply the value in the state by 10 and add it to the next command". The state is:
State {
value: 0
}
The steps would be:
1 -> 10*0 + 1 = 1
2 -> 10*1 + 2 = 12
1 -> 10*12 + 1 = 121
6 -> 10*121 + 6 = 1216
8 -> 10*1216 + 8 = 12168
3 -> 10*12168 + 3 = 121683
2 -> 10*1 + 2 = 12
1 -> 10*12 + 1 = 121
6 -> 10*121 + 6 = 1216
8 -> 10*1216 + 8 = 12168
3 -> 10*12168 + 3 = 121683
In the second example, there is no need to count the number of times a command has been accepted. The State can, of course, store any variables that allow it to apply its rules correctly. In the case of these two examples, only one rule is needed that is either independent of the State or is a pure function of stored values. In more complex text parsing, there could be many sets of rules that are applied in response to different configurations of the State.
Scopes
An effective way to encapsulate the interior of the State and the functions that can be applied to it is through the use of scopes, or objects with a common set of parameters and methods that can describe nodes in a tree or a web of data. In the case of numbers above, using the first strategy, the State contains only one type of scope:
(defclass number-scope () |
((value |
:initform 0 |
:accessor value) |
(count |
:initform 0 |
:accessor count))) |
(defmethod accept-number ((number-scope number-scope) number) |
(incf (value number-scope) |
(* |
number |
(expt (count number-scope) 10))) |
(incf (count number-scope))) |
(defmethod execute-command ((number-scope number-scope) command) |
(case command |
((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9) |
(accept-number number-scope command)))) |
The
(execute-command)
method can be implemented by every scope class in the format specification, although in this case, there is only one scope. The scope can be initialised and fed with commands as follows:(defun run-parse (stream &key existing-scope) |
(let |
((command (read-char stream nil nil))) ; <--- read a single character |
(when (or (not command) *error*) ; <--- exit if stream is empty |
(return-from run-parse *active-scope*)) |
(execute-command *active-scope* command) ; <--- run (execute-command) method |
(run-parse stream :existing-scope *active-scope*))) |
(defun parse (stream &key existing-scope) |
(let* |
((*active-scope* ; <--- initialise the active scope |
(or |
existing-scope ; <--- allow for restarts using an incomplete parse |
(make-instance (quote number-scope)))) ; <--- new number scope |
(*root-scope* *active-scope*)) ; <--- keep track of first scope |
(run-parse stream :existing-scope *active-scope*))) ; <--- begin parsing |
Even if the program exits early due to an error, the
*active-scope*
variable stores the last scope to be processed, including the error that has been attached to it, to allow parsing to restart based on the type of error or any other custom behaviour. A method such as
(has-error-p)
could be enough to correctly respond to such an outcome.The program can keep a record of the most recently active scope so that it can continue to receive commands or choose to relinquish its focus in response to a command. This complexity is not needed for the numbers example, but it allows simple and complex formats to be handled in a consistent manner.
Case Study: Json
The JSON format is a simple text format summarised here:
http://www.json.org/. It allows for portable, human-readable text encoding of arbitrarily complex data. It does this by organising data into a tree hierarchy with several different types of nodes. This tree structure runs deeper than it might first appear, governing how numbers are formatted, for example. The parsing of JSON can be achieved with nineteen scope classes. Note that this is not the only possible configuration.
json-scope
json-list-scope
json-list-item-scope
json-structure-scope
json-structure-item-scope
json-structure-item-key-scope
json-structure-item-value-scope
json-false-scope
json-true-scope
json-null-scope
json-string-scope
json-character-scope
json-escape-scope
json-unicode-scope
json-number-scope
json-zero-scope
json-integer-scope
json-decimal-scope
json-exponent-scope
json-list-scope
json-list-item-scope
json-structure-scope
json-structure-item-scope
json-structure-item-key-scope
json-structure-item-value-scope
json-false-scope
json-true-scope
json-null-scope
json-string-scope
json-character-scope
json-escape-scope
json-unicode-scope
json-number-scope
json-zero-scope
json-integer-scope
json-decimal-scope
json-exponent-scope
JSON Numbers
A good example of the parser in action would be the representation of a number (indentation signifies a parent-child relationship):
12.34E10
json-scope
json-number-scope
json-integer-scope: (12)
json-decimal-scope: (34)
json-exponent-scope
json-number-scope
json-integer-scope: (10)
Note that the exponent scope is simply a wrapper for another number scope, allowing the same rules to be used. It is also important to note that the characters "." and "E" are not represented, and serve only to indicate that the number is transitioning from one type of child to the other. Most characters in JSON will trigger structural changes rather than the addition of data to the state. Once the data is represented in this form, it can easily be "rendered" into its final numerical form within the program for practical purposes.
Now, to step through this parsing one command at a time:
1. Start with an empty
json-scope
(active =
json-scope
)2. "1" -> open
json-number-scope
on
json-scope
and run command again (active =
json-number-scope
)3. "1" -> open
json-integer-scope
on
json-number-scope
and run command again (active =
json-integer-scope
)4. "1" -> append "1" to
json-integer-scope
(active =
json-integer-scope
)5. "2" -> append "2" to
json-integer-scope
(active =
json-integer-scope
)6. "." -> open
json-decimal-scope
on parent of
json-integer-scope
(active =
json-decimal-scope
)7. "3" -> append "3" to
json-decimal-scope
(active =
json-decimal-scope
)8. "4" -> append "4" to
json-decimal-scope
(active =
json-decimal-scope
)9. "E" -> open
json-exponent-scope
on parent of
json-decimal-scope
(active =
json-exponent-scope
)10. "1" -> open
json-number-scope
on
json-exponent-scope
and run command again (active =
json-number-scope
)11. "1" -> open
json-integer-scope
on
json-number-scope
and run command again (active =
json-integer-scope
)12. "1" -> append "1" to
json-integer-scope
(active =
json-integer-scope
)13. "0" -> append "0" to
json-integer-scope
(active =
json-integer-scope
)14. "\n" -> set active to be parent of outermost
json-number-scope
and run command again (active =
json-scope
)15. "\n" -> do nothing
Even within a structure seemingly as simple as a number, there is plenty of complexity involved in transitioning from parent scopes to their children. For example, the response to the "1" character at both the
json-scope
and
json-number-scope
is to open a child and run the command there instead. This passes the focus to the child so the following command will also be applied there. The response to "." at the
json-integer-scope
is to open a child under its parent (
json-number-scope
), transitioning to a new scope and relinquishing control of the next command.A key difference between the integer scope and the decimal scope is their treatment of the "0" character. An integer cannot begin with a zero, but a decimal can. If the next character is a zero, the newly opened decimal scope will respond correctly. Note that this can also be achieved by simply recording the position of the decimal point as a variable in the number scope rather than dealing with different rulesets for different classes. The parsing protocol allows for this flexibility, but the hierarchy of scopes and rules chosen can affect performance.
The repetition of commands allows for consistent rules for each type of scope, although this can be optimised further. The reason to forego this optimisation is to avoid the permutations that extend from a single command given the specific layout of the hierarchy.
Processing can be halted at any step, and the partially parsed State will respond to the next command as it would if the processing had continued normally. That is to say, the processing is not "waiting" for a particular character to complete a section or for a pattern to emerge. This makes it very flexible in an environment where data can be received in a staggered or unpredictable way. It also means that the parsing does not happen recursively, tying up memory while substructures are processed.
JSON Structure
While a structure is normally thought of as a hash table, it is actually represented as a list with a specific format for each item. Given the following example:
{
"key1": "value1",
"key2": [
"value2"
]
}
Each item in the structure list has the form:
"STRING_KEY": ANY_VALUE_ACCEPTED_BY_TOP_LEVEL
In this way, the format allows for recursive data structures. The example can be represented as follows:
json-scope
json-structure-scope
json-structure-item-scope
json-structure-item-key-scope
json-string-scope
json-character-scope: (key1)
json-structure-item-value-scope
json-string-scope
json-character-scope: (value1)
json-structure-item-scope
json-structure-item-key-scope
json-string-scope
json-character-scope: (key2)
json-structure-item-value-scope
json-list-scope
json-list-item-scope
json-string-scope
json-character-scope: (value2)
Each item has two children: a key, which can only contain a string, and a value. The value can contain any scope that is valid at the top level. In the case of the second item, it contains a list.
Two important rules for this format:
1. The ":" character received by the
json-structure-item-key-scope
(since it's
json-string-scope
child has already been closed by the "\"" character) transitions to the next child of the
json-structure-item-scope
(
json-structure-item-value-scope
).2. The "," character received by the
json-structure-item-value-scope
transitions to the next child of its grandparent
json-structure-scope
. Importantly, each child that the value scope might have has a character ("]" "}", "\"") to close itself EXCEPT numbers. This means that the individual number scopes (integer, decimal, and exponent) need to have a way of closing the value scope themselves. This is due to the lack of delimiting character for numbers. This is the price of semantic clarity. On top of this, the
json-structure-item-value-scope
needs to be able to close its grand parent upon receiving the "}" character since trailing commas are not supported by the specification. This also applies to list item scopes.Performance
Performance will depend greatly on the complexity of the text format and the particular hierarchy of the scope classes, but there are some principles that can lead to greater performance.
1. As far as possible, minimise function calls. If a simple parent operation can be done at the child level, then it should be done rather than calling a parent method. Although it might be tempting to find the most elegant abstraction of all the scope classes to allow inheritance and composition of rules, this should also be minimised for the same reason.
2. The use of the active scope pointer prevents the need for recursive active scope checks.
3. Using (cons) to add the beginning of a list of child or content rather than (append) can greatly increase performance. This results in lists being built backwards, but they can be reversed when the parsing process is complete, rather than during operation.
YES:
(setf (children scope) (cons new-child (children scope)))
NO:
(setf (children scope) (append (children scope) (list new-child)))
This is due to the linked list structure within Common Lisp. This will obviously depend on the language of implementation.
4. Minimise calls needed to lookup the method to run on scope for a command. The best way to do this is to use (case). A simple example is the ruleset for the
false-scope
(defclass false-scope (keyword-scope)
((keyword-value
:initform *json-false*
:reader keyword-value)))
(defmethod execute-command ((false-scope false-scope) command)
(case command
((#\f #\a #\l #\s #\e) (append-content false-scope command))))
The (case) macro can handle a list of characters that all run the same method. For more complex rulesets, the (case) macro can contain more options and a default case that calls (execute-command) on an inherited class.
Errors
To allow for customisable error response, errors do not kill the parsing program by default. An error is added to the active scope if the command fails. The simplest failure that can occur is for the scope to lack an appropriate method for a specific command. For example, if the command "a" followed "1", which had opened a number and integer scope, the active integer scope would not have a method to run for the "a" command. This will depend on the format specification. Sometimes, it might be useful to accept a command even if it will fail to point out a serious formatting error such as a trailing comma in a list, which would otherwise be difficult to debug. Error objects are appended to a list of errors on the scope object. Error objects can contain any parameters needed to indicate the nature of the problem, along with a method to generate a string representation of the error for convenience.
Conclusion
A parsing method has been presented that allows for text formats to be abstracted in terms of a state that responds to individual characters as commands that modify or append to it. Using number parsing as an example, a chain of single commands can produce a hierarchy that can both be used to recover the original input and be rendered simply into a usable value. There are several notable advantages to this approach.
Firstly, the consistent scope model can allow a greater degree of flexibility in the types of data custom schemes can accept. For example, an XML parser can allow for embedded JSON parsing by allowing a
json-scope
to be created as a child of
xml-children-scope
. Since the underlying scope construct implements the same scope interface, they are interchangeable. Another example might be the JSX or TypeScript extensions to the JavaScript format. These are simple modifications of or additions to existing rules, rather than a completely separate protocol.Secondly, the output, a scope, can be manipulated in more complex ways than native data types. For JSON, a
search
or
get
method can be implemented on a subclass of
scope
that allows for a path to be returned from nested data. Other utilities such as format-aware diffing can be achieved far more easily since the scope maps exactly to the format.Lastly, given the scope structure, the original input can be recovered very faithfully. Some exceptions include whitespace or the "e/E" characters in JSON. These have no semantic meaning, so there is no reason to include differentiation in the format ruleset. The parsing process can essentially be run backwards, starting from the last child of the last scope and working backwards, writing one character at a time to a stream. This offers the same flat, non-recursive behaviour as the input, making efficient use of available memory.
These advantages come at the cost of some performance, which will depend on the precise nature of the format and the choice of implementation. The performance of this approach might not reach the speed of format-specific implementations that do not make use of scope constructs and whose sole purpose is to immediately yield usable data.
Appendix A: Core
The core code consists of several scope classes, an error class, and the root parse method, which accepts a stream and calls the
(execute-command)
method on the
*active-scope*
object recursively. Opening and closing scopes control how the
*active-scope*
pointer moves.Any extension to this core will involve the creation of scope classes which implement their own
(execute-command)
method.(asdf:defsystem :parse | |
:description "" | |
:author "" | |
:license "" | |
:version "" | |
:serial t | |
:components | |
((:file "packages") | |
(:file "error") | |
(:file "scope") | |
(:file "parse"))) |
(defpackage :parse | |
(:use :cl) | |
(:export | |
*root-scope* | |
*active-scope* | |
scope-error | |
scope | |
scope-name | |
parent | |
grandparent | |
great-grandparent | |
errors | |
empty-p | |
has-errors-p | |
add-error | |
close-scope | |
execute-command | |
print-scope | |
content-scope | |
content | |
append-content | |
container-scope | |
children | |
open-child | |
keyword-scope | |
parse)) |
(in-package :parse) | |
(defclass scope-error () | |
((value | |
:initarg :value | |
:initform nil | |
:accessor value))) | |
(defclass no-function-error (scope-error) | |
((name | |
:initarg :name | |
:reader name) | |
(command | |
:initarg :command | |
:reader command))) | |
(defmethod initialize-instance | |
:after ((no-function-error no-function-error) &key) | |
(setf | |
(value no-function-error) | |
(format nil "No function for command \"~A\" in scope ~A" | |
(command no-function-error) | |
(name no-function-error)))) | |
(defun make-no-function-error (name command) | |
(make-instance (quote no-function-error) | |
:name name | |
:command command)) |
(in-package :parse) | |
(defparameter *error* nil) | |
(defparameter *root-scope* nil | |
"The first scope") | |
(defparameter *active-scope* nil | |
"The scope that will receive the next command") | |
(defclass scope () | |
((parent | |
:initform nil | |
:initarg :parent | |
:reader parent) | |
(errors | |
:initform nil | |
:accessor errors))) | |
(defmethod scope-name ((scope scope)) | |
(class-name (class-of scope))) | |
(defmethod empty-p ((scope scope)) nil) | |
(defmethod grandparent ((scope scope)) | |
(parent (parent scope))) | |
(defmethod great-grandparent ((scope scope)) | |
(parent (grandparent scope))) | |
(defmethod has-errors-p ((scope scope)) | |
(not (null (errors scope)))) | |
(defmethod add-error ((scope scope) (scope-error scope-error)) | |
(setf (errors scope) | |
(cons scope-error (errors scope))) | |
(setf *error* t) | |
scope) | |
(defmethod close-scope ((scope scope) &optional command) | |
(setf *active-scope* (parent scope)) | |
scope) | |
(defmethod execute-command ((scope scope) command) | |
(case command | |
(t (add-error scope (make-no-function-error (scope-name scope) command))))) | |
(defun print-scope-errors (errors level) | |
(when errors | |
(let | |
((scope-error (first errors))) | |
(format t "~%~A error: ~A" | |
(make-string (* 2 level) :initial-element #\-) | |
(value scope-error))) | |
(print-scope-errors (rest errors) level))) | |
(defmethod print-scope ((scope scope) &key (level 0)) | |
(format t "~%~A ~A" | |
(make-string (* 2 level) :initial-element #\-) | |
(scope-name scope)) | |
(when (has-errors-p scope) | |
(let* | |
((errors (errors scope)) | |
(number (length errors))) | |
(format t ": (~A error~A)" | |
number | |
(if (= 1 number) "" "s")) | |
(print-scope-errors (reverse errors) (+ 1 level))))) | |
(defclass content-scope (scope) | |
((content | |
:initform nil | |
:accessor content))) | |
(defmethod empty-p ((content-scope content-scope)) | |
(null (content content-scope))) | |
(defmethod append-content ((content-scope content-scope) char) | |
(setf (content content-scope) (cons char (content content-scope)))) | |
(defmethod print-scope ((content-scope content-scope) &key) | |
(call-next-method) | |
(unless (has-errors-p content-scope) | |
(format t ": ~A" (reverse (content content-scope))))) | |
(defclass container-scope (scope) | |
((children | |
:initform nil | |
:accessor children))) | |
(defmethod empty-p ((container-scope container-scope)) | |
(null (children container-scope))) | |
(defmethod open-child ((container-scope container-scope) &optional scope-type) | |
(let | |
((new-child | |
(make-instance (or scope-type (quote scope)) :parent container-scope))) | |
(setf (children container-scope) (cons new-child (children container-scope))) | |
(setf *active-scope* new-child) | |
new-child)) | |
(defun print-child-scopes (child-scopes level) | |
(when child-scopes | |
(let | |
((child (first child-scopes))) | |
(print-scope child :level (+ 1 level)) | |
(print-child-scopes (rest child-scopes) level)))) | |
(defmethod print-scope ((container-scope container-scope) &key (level 0)) | |
(call-next-method) | |
(unless (has-errors-p container-scope) | |
(print-child-scopes (reverse (children container-scope)) level))) |
(in-package :parse) | |
(defun run-parse (stream &key existing-scope) | |
(let | |
((command (read-char stream nil nil))) | |
(when (or (not command) *error*) | |
(return-from run-parse *active-scope*)) | |
(execute-command *active-scope* command) | |
(run-parse stream :existing-scope *active-scope*))) | |
(defun parse (stream &key existing-scope) | |
(let* | |
((*active-scope* | |
(or | |
existing-scope | |
(make-instance (quote scope)))) | |
(*root-scope* *active-scope*)) | |
(run-parse stream :existing-scope *active-scope*))) |
parse.lisp
: when the data in the stream is exhausted, execution will stop. The
*root-scope*
pointer stores the first scope to start the process. If this pointer equals the
*active-scope*
pointer, the execution has successfully executed all commands, yielding a complete parse. If this is not the case, either an error has stopped the parse due to some custom behaviour, or the parse is incomplete. The parse can be resumed by calling the (parse) method on the existing
*active-scope*
.Appendix B: Installation
Install SBCL
The SBCL (Steel Bank Common Lisp) compiler is an open source Lisp compiler that supports many common operating systems. On OSX, SBCL can be installed with Homebrew:
~$ brew install sbcl
Download Quicklisp
Quicklisp allows for easy Lisp package management. It can be found here:
http://www.quicklisp.org/beta
1. Download the file
quicklisp.lisp
.2. Open an SBCL shell and load the file:
~$ sbcl --load quicklisp.lisp
This will initialise the quicklisp directory in the home folder in order to store downloaded lisp packages. Quicklisp uses the ASDF system under the hood and adds remote fetching and package resolution.
3. Modify quicklisp setup script to include parse directory (place this at the end of the file):
* (push "~/path/to/parse/" ql:*local-project-directories*)
4. The parse package can now be used in any other package like so:
(load "~/quicklisp/setup.lisp")
(ql:quickload :parse)
; use any symbol exported by parse here
(ql:quickload :parse)
; use any symbol exported by parse here
5. Ensure the Quicklisp
system-index.txt
file updates. This should happen automatically when Quicklisp detects that the timestamp of the local project directory has changed:
~$ sbcl --load quicklisp.lisp
* (load "~/quicklisp/setup.lisp")
* (ql:register-local-projects)
* (ql:register-local-projects)
Appendix C: Example
Following is the number parsing example from sections 5 and 6. The version presented here reads characters from the left side of the number, since this is the order offered by the file stream. This software assumes that the parse package is already available via Quicklisp.
parse-number.asd
(asdf:defsystem :parse-number | |
:description "" | |
:author "" | |
:license "" | |
:version "" | |
:depends-on | |
(:parse) | |
:serial t | |
:components | |
((:file "parse-number"))) |
parse-number.lisp
(defpackage :parse-number | |
(:use :cl :parse) | |
(:export | |
parse-number | |
print-scope)) | |
(in-package :parse-number) | |
(defclass number-scope (scope) | |
((value | |
:initform 0 | |
:accessor value) | |
(command-count | |
:initform 0 | |
:accessor command-count))) | |
(defmethod print-scope ((number-scope number-scope) &key (level 0)) | |
(format t "~A~%" (value number-scope))) | |
(defmethod accept-number ((number-scope number-scope) number) | |
(setf (value number-scope) | |
(+ | |
(* 10 (value number-scope)) | |
(case number | |
(#\0 0) | |
(#\1 1) | |
(#\2 2) | |
(#\3 3) | |
(#\4 4) | |
(#\5 5) | |
(#\6 6) | |
(#\7 7) | |
(#\8 8) | |
(#\9 9))))) | |
(defmethod execute-command ((number-scope number-scope) command) | |
(case command | |
((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9) | |
(accept-number number-scope command)))) | |
(defun parse-number (stream &key existing-scope) | |
(let | |
((existing-scope | |
(or | |
existing-scope | |
(make-instance (quote number-scope))))) | |
(parse stream :existing-scope existing-scope))) |
test.lisp
(load "~/quicklisp/setup.lisp") | |
(ql:quickload :parse-number) | |
(defpackage :test-parse-number | |
(:use :cl :parse-number)) | |
(in-package :test-parse-number) | |
(defun parse-file (path) | |
(with-open-file (stream path) | |
(parse-number stream))) | |
(let | |
((scope (parse-file "./test.txt"))) | |
(print-scope scope)) |
test.txt
1234567890 |
Installation
The installation of this example package is similar to the approach for the parse package, assuming you have already completed the steps to set up Quicklisp.
1. Modify quicklisp setup script to include parse-number directory. Alternatively, place both directories under a common directory which is used in the Quicklisp
setup.lisp
file.:
* (push "~/path/to/parse-number/" ql:*local-project-directories*)
2. The parse-number package can now be used in any other package like so:
(load "~/quicklisp/setup.lisp")
(ql:quickload :parse-number)
(ql:quickload :parse-number)
3. Run the
test.lisp
file.
~$ cd path/to/parse-number/
~$ sbcl --script test.lisp
~$ sbcl --script test.lisp
To load "parse-number":
Load 1 ASDF system:
parse-number
; Loading "parse-number"
[package parse-number]
1234567890