Imagine you need a stream of data from any source available on the web, be it currency prices, temperature readouts, or anything else. In our example, it will be a stream of cryptocurrency prices.

Let's get started. I'll show you how to scrape price data, but the mechanism works for any stream from a website. Feel free to let us know in the comments what you might be using this for!

1. Install Scramjet Transform Hub and the CLI

We'll be using Scramjet CLI to pack and send the programs to the server, so go ahead and install it.


npm i -g @scramjet/cli

To run our sample, you'll need to install Scramjet Transform Hub (STH) and start it. You can install it on the same machine or on a remote one that you have network access to.


# This will download and install STH
npm i -g @scramjet/sth

# This will start it
sth

# See sth --help for more info.

If you installed STH on a remote machine, run the ip a command there to see which IP addresses the machine has.

2. Get the sample from Scramjet's repo

Clone the repo locally, go into the samples directory and find crypto-prices. We have to install our local dependencies as well.

Copy the following lines to your terminal:


# This downloads the repo to your computer
git clone https://github.com/scramjetorg/scramjet-cloud-docs.git

# This will enter the directory with the sample
cd scramjet-cloud-docs/samples/crypto-prices

# This will install the local dependencies
npm install

3. Check out the sequence code

Now let's open up an editor to get a better understanding of what's going on inside.


# If you're viewing the file from within VSCode use this:
code index.ts

# Otherwise use either vi, nano or just write editor to use the default one
editor index.ts

We're exporting an async generator function with default argument values. You can think of it as a function that returns multiple values over time. It's typed as a ReadableApp: it ignores the input stream and uses the three other parameters, which are the two currency strings and the interval.


const app: ReadableApp<string> = async function* (
    _stream,
    currency = "BTC",
    baseCurrency = "USD",
    interval = 3000
) {

Note: Using the types is not required, but it makes it easier for you to write code that will work with STH.
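
To make the "function that returns multiple values over time" idea concrete, here is a minimal, self-contained sketch of an async generator and a consumer. The names ticker and collect are made up for this illustration; they are not part of the sample or of STH.

```typescript
// Hypothetical example: an async generator that yields `count` values,
// pausing `intervalMs` milliseconds before each one.
async function* ticker(count: number, intervalMs = 10): AsyncGenerator<string> {
    for (let i = 1; i <= count; i++) {
        // Wait before producing the next value.
        await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
        yield `tick ${i}\n`;
    }
}

// Consume any async iterable with `for await`, collecting every yielded value.
async function collect(source: AsyncIterable<string>): Promise<string[]> {
    const values: string[] = [];
    for await (const value of source) {
        values.push(value);
    }
    return values;
}
```

Calling collect(ticker(3)) resolves to three strings, one per yield. The sequence in the sample works the same way, except that its loop never ends and the consumer is the hub rather than a local function.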

As you can see inside, we run an infinite loop and fetch the data from the API based on the arguments. At the beginning of each iteration we also start a timer, which will come in handy a bit later. Notice we're not awaiting it yet.


    while (true) {
        const ref = defer(interval);
        const data = await fetch(`https://api.coinbase.com/v2/prices/${currency}-${baseCurrency}/spot`);
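
The request above does not check the HTTP status, so an error response would be passed along like any other payload. As a hedged sketch of one way to guard against that, the helper below throws on a non-2xx status. The names spotPriceUrl, fetchSpotPrice, FetchLike and MinimalResponse are hypothetical and not part of the sample; the fetch function is passed in as a parameter so the helper can be exercised without network access.

```typescript
// Minimal shape of the parts of a fetch response this sketch relies on.
interface MinimalResponse {
    ok: boolean;
    status: number;
    json(): Promise<unknown>;
}

type FetchLike = (url: string) => Promise<MinimalResponse>;

// Builds the same URL the sample requests.
function spotPriceUrl(currency: string, baseCurrency: string): string {
    return `https://api.coinbase.com/v2/prices/${currency}-${baseCurrency}/spot`;
}

// Hypothetical helper (not in the sample): fetches the spot price and
// throws on a non-2xx status instead of returning an error body.
async function fetchSpotPrice(
    currency: string,
    baseCurrency: string,
    fetchFn: FetchLike
): Promise<unknown> {
    const response = await fetchFn(spotPriceUrl(currency, baseCurrency));
    if (!response.ok) {
        throw new Error(`Price request failed with HTTP status ${response.status}`);
    }
    return response.json();
}
```

In an environment that provides a global fetch, it should be possible to pass that function as fetchFn; in tests, a stub that returns a canned response is enough.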

Then we yield the result with a line break appended for prettier output.


        yield JSON.stringify(await data.json()) + "\r\n";

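Finally, we wait for the previously set timer. Because the timer was started before the request, the time spent on the request counts towards the interval: with the default setting we make a request every 3 seconds, rather than waiting 3 seconds after each request finishes.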
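
The pacing pattern described above can be sketched in isolation. The article does not show where the sample's defer helper comes from, so the version below is a stand-in that assumes it simply returns a promise resolving after the given number of milliseconds. The poll generator and its rounds parameter are hypothetical, added only so the loop terminates.

```typescript
// Assumed behaviour of `defer`: a promise that resolves after `ms` milliseconds.
// The sample's actual helper is not shown, so this is a stand-in.
function defer(ms: number): Promise<void> {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical generator showing the pacing pattern: the timer starts
// before the work and is awaited after it, so the time spent in `task`
// counts towards the interval instead of being added on top of it.
async function* poll<T>(
    task: () => Promise<T>,
    intervalMs: number,
    rounds: number
): AsyncGenerator<T> {
    for (let i = 0; i < rounds; i++) {
        const timer = defer(intervalMs); // start the clock, but don't wait yet
        yield await task();              // do the work and emit the result
        await timer;                     // wait only for what is left of the interval
    }
}
```

If the task takes longer than the interval, the timer has already resolved by the time it is awaited, so the next round starts immediately.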

4. Let's build and run.

Now that we have an idea of what's going on under the hood, let's run our application.

We'll create a standalone package for our sequence. We start by building our source code, that is, by running the TypeScript compiler (this step isn't necessary for plain JavaScript modules).


# this builds the code
tsc -p tsconfig.json

# this copies the package.json file to the output directory
# note: the directory depends on your tsconfig.json
cp -r package.json dist/

# this will install dependencies in the dist folder
(cd dist && npm i --only=production)

The sample has a handy npm script that does all of the above in one simple command:


npm run build

Now, let's use the SI CLI to pack the dist directory into a compressed tar.gz package.


si pack -o crypto-sample.tar.gz dist/

And we can now send it as a sequence like this:


# You can provide the name
si seq send crypto-sample.tar.gz

# we can use the "-" sign to call the last one
si seq send -

Now we have the ID of our newly created sequence. Nothing runs yet, but let's change that and start our sequence. To do so, we need to pass the sequence ID (or "-" for the last one) and our arguments: the two currency symbols. We won't pass the interval, so it will default to 3000 milliseconds, as in the code.


si seq start - BTC USD
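
To illustrate how the omitted interval falls back to its default, here is a small sketch that mirrors the sample's parameter defaults. The resolveArgs helper and the PriceArgs type are hypothetical. The string handling assumes that a value given on the command line could reach the code as text; that is an assumption made for this sketch, not something the sample states.

```typescript
interface PriceArgs {
    currency: string;
    baseCurrency: string;
    interval: number;
}

// Hypothetical helper mirroring the sample's parameter defaults.
function resolveArgs(
    currency = "BTC",
    baseCurrency = "USD",
    interval: number | string = 3000
): PriceArgs {
    // Defensive coercion: assumes a value passed on the command line
    // could arrive as a string such as "5000".
    const ms = Number(interval);
    if (!Number.isFinite(ms) || ms <= 0) {
        throw new Error(`Invalid interval: ${interval}`);
    }
    return { currency, baseCurrency, interval: ms };
}
```

With only the two currency symbols supplied, as in the command above, the resulting interval is 3000.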

Now we can connect to the sequence output and see how it shows up:


si seq output -

In the terminal you'll see a new line with the current exchange rate every 3 seconds.

Easy peasy! In your application, you will likely be consuming our REST API. Take a look at Scramjet Transform Hub's API documentation here.
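
However your application reads the output, the sequence emits one JSON document per line, and text chunks read from a stream do not necessarily end on a line boundary. The sketch below shows one way to buffer chunks and parse only complete lines. JsonLineParser is a hypothetical helper, not part of Scramjet's API.

```typescript
// Hypothetical helper: turns arbitrary text chunks into parsed JSON values,
// one per complete line. Incomplete trailing text is kept until the next chunk.
class JsonLineParser {
    private buffer = "";

    push(chunk: string): unknown[] {
        this.buffer += chunk;
        const lines = this.buffer.split(/\r?\n/);
        // The last element is either "" or an unfinished line; keep it buffered.
        this.buffer = lines.pop() ?? "";
        return lines
            .filter((line) => line.trim() !== "")
            .map((line) => JSON.parse(line));
    }
}
```

Each call to push returns the values completed by that chunk, so a consumer can feed it data as it arrives and handle prices one at a time.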

5. Let's clean up.

The program will run forever unless you stop it first. Let's try to stop the service now:


si inst stop - 3000

The last started instance will stop in 3 seconds (that's the 3000 argument there). You can then delete the sequence afterward:


si seq delete -

Wrap up

And there you go. In this article I've shown how to write and deploy a simple data acquisition program to Scramjet Transform Hub, and how to access the data it produces.

Here are a couple of links you may also want to see: