Pattern matching & Text analysis
Haddock
Within Colbert we extended SQL with a powerful domain specific language for analysing and mutating text.
With this language, called Haddock, any analysis or manipulation of text is possible.
What are the key benefits ?
By integrating Haddock into SQL you can solve complex text problems within your DBMS by simply writing a SQL statement.
How does it work ?
Haddock treats text as a string of characters in which you can freely move right and left.
Haddock keeps track of the position in the string by a certain register.
There are more registers for copying text, counting etc.
Haddock support a lot of one letter operators that perform test, find, substitute, move, delete or count operations.
With this you can perform almost any operation on text.
Combined with the simplicity of function definition in Colbert, you can now very easily create any string function.
Example
The challenge: Get the second number from a text:
‘Hank 12 Main road 9788 Married 23987′ :: {G’0-9’ M’0-9’*} 2 Q@
Explanation:
::
G’0-9′
M’0-9′
*
{ … } 2
Q@
Make the string a Haddock object
Go to the first character between 0 and 9
Move current character to the buffer register if it is between 0 and 9
Perform last operation as often as possible
Perform enclosed operations 2 times
Quit Haddock yielding the buffer register as a result.
You can also define this as a function and call it subsequently with some parameters or use it in a SQL statement.
Create a function
Call GetNum to get second number
Use the function in SQL
Create func GetNum as Source :: {G’0-9′ M’0-9’*} Nth Q@
GetNum(‘Hank 12 Main road 9788 Married 23987’, 2)
Select GetNum(Info, 2) as Zip from UserInfo
© Colbert Databases BV 2019