Migrating from AngularJS to Vue — Part 1

Author: Rotem Benjamin

Here at Tipalti, we have multiple web applications that were written over the course of the past 10 years. The newest of them is an AngularJS application that was born at the beginning of 2014. Back then, AngularJS was the most common JavaScript framework, and it made sense to start a new application with it. The following is our journey of migrating and rewriting this app from AngularJS to a new VueJS 2 app.

Why Migrate?

Google announced in 2018 that AngularJS was entering support-only mode, and that even this support would end in 2021. We wouldn’t want our application running on a framework without any support, so the sooner we could start migrating the better, as a migration can take quite some time.

Why Vue?

As of this writing, there are 3 major players in the front-end framework ecosystem: Angular, React and Vue.
It may appear that, for most people, migrating from AngularJS to Angular is a better idea than moving to a different framework. However, this migration is as complicated as any other option, because Google completely rewrote the framework for Angular.

I won’t go into the whole process we underwent until finally deciding to go with Vue, but I’ll share the main reasons we chose to do so. You can find a lot of information on various blogs and tech sites that support these claims.

  • Great performance — Vue is supposed to have better performance than Angular and, according to some comparisons, better than React as well.
  • Growing community and GitHub commits — It is easy to see on Trends that the Vue community is growing with each passing month, and its GitHub repo is one of the most active ones.
  • Simplicity — Vue is so easy to learn! Tutorials are short and clear, and in a very short time one can gain the knowledge needed to start writing Vue applications.
  • Similarity to AngularJS — Though it is not a good reason on its own, the similarity to AngularJS makes learning Vue easier.

You can read more reasons in this fine post.

What was our status when we started?

It is important to mention that our application was and still is in development, so stopping everything and releasing a new version 6 months (hopefully) later was not an option. We had to take small steps that would allow us to gradually migrate to Vue without losing the ability to develop new features in our application going forward.

During the course of the application’s development, we followed best-practice guides such as John Papa’s and Todd Motto’s.

At that point, the state of our application was such that:

  • All code was bundled with Webpack.
  • All constant files were AngularJS consts.
  • All model classes were AngularJS Factories.
  • All API calls were made from AngularJS Services.
  • Some of the UI was plain HTML with a controller related to it (especially for ui-router), and some was written as components (following the introduction of components in AngularJS 1.5).
  • Client unit testing was written using Jasmine and Karma.

Can it be done incrementally?

Migrating a large-scale application such as ours to Vue is a long process. We had to think carefully about the steps we would need to take in order to make it a successful one that would allow us to write new code in Vue and migrate small portions of the app over time.

As you can understand from the above, our code was tightly coupled to AngularJS, so in order to migrate to Vue, some steps had to be taken prior to the actual migration — and that was the first part of our migration plan.

It’s important to emphasize that as we started the migration, we understood that although the best-practice guides mentioned above are really good, they had led us to couple our code to a specific framework without a real need to do so.

This is also one of the lessons we learned during our migration — write as much pure JS code as you possibly can and depend as little as possible on frameworks, as frameworks come and go, and you can easily find yourself trying to migrate again to a new framework in 2–3 years. It seems obvious, as that’s what the SOLID principles are all about; however, it’s sometimes easy to miss those principles when you don’t consider the framework itself to be a dependency.

What were our first steps in the migration plan?

So, the preliminary stage of our migration plan was to detach everything we could from AngularJS: modifying code that has few dependencies of its own, but that many other modules depend on, from AngularJS style to ES6 style. By doing so, we detached it from the AngularJS ecosystem. In practice, we transformed shared code so it is consumed via import/export instead of AngularJS’s built-in dependency injection.

To wrap things up, we created the following action items:

  • Remove all const injection from AngularJS and use export statements. Every new const will be added as an ES6 const and consumed via import/export.
  • Update our httpService (which was an AngularJS service), the HTTP request proxy in our app. Every API call in our application was made through this service. We replaced the $http dependency with axios, and by doing so created an AngularJS-free httpService. Following that, we removed the injection of httpService from every component in our app and brought it in with an import statement instead (see the sketch after this list).
  • Create a new testing environment using Chai, Sinon, and Mocha, which allows us to test ES6 classes that are not part of the AngularJS app.
  • Transfer all BL (business logic) services from AngularJS to pure ES6 classes and remove them from the app’s AngularJS dependency injection.
  • Create a private NPM package with shared Vue controls that will be used by every application we have. This is not a necessity for the first step of the migration, but it allows us to reuse components across multiple apps and have them share the same behavior and styling (which can be overridden, of course), which is something that is expected from different apps in the same organization.
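
To illustrate the first two items, here is a rough sketch of what the detached code could look like (the file and function names are illustrative, not our actual code):

// constants.js: a plain ES6 const instead of an AngularJS .constant() injection
export const API_BASE_URL = '/api/v1';

// httpService.js: an AngularJS-free HTTP proxy built on axios instead of $http
import axios from 'axios';
import { API_BASE_URL } from './constants';

export async function get(path, params) {
    const response = await axios.get(`${API_BASE_URL}${path}`, { params });
    return response.data;
}

export async function post(path, data) {
    const response = await axios.post(`${API_BASE_URL}${path}`, data);
    return response.data;
}

Consumers simply import these functions, so nothing here is registered with (or injected by) AngularJS.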

What’s next?

As for our next steps, every new feature will be developed in Vue and, for now, will live inside the AngularJS app by using the ng-vue directive; at the same time, we can start migrating logical components and pages from AngularJS to Vue by leveraging that same great ng-vue directive.
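
For reference, embedding a Vue component in the AngularJS app through ngVue’s documented createVueComponent factory looks roughly like this (the component and module names below are made up for the example):

import angular from 'angular';
import Vue from 'vue';
import 'ngVue';
import PayeeDetails from './PayeeDetails.vue';

angular
    .module('app', ['ngVue'])
    // exposes the Vue component to AngularJS templates as <payee-details>
    .directive('payeeDetails', createVueComponent =>
        createVueComponent(Vue.component('payeeDetails', PayeeDetails))
    );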

We did migrate one page to ng-vue and added new tests for the Vue components we created, in addition to embedding Vuex, which will eventually take control of the state data.
Once we finished all of the above, we were ready to take the next step of our migration.
More to follow in the next post.

NodeJS MS-SQL integration testing with Docker/Mocha

Author: Zachary Leighton

Integration testing vs. unit testing

Unit tests are great for testing a function for a specific behavior; if you write your code right and create your “mocked” dependencies right, you can be reasonably assured of your code’s behavior.

But our code doesn’t live in isolation, and we need to make sure all the “parts” are connected and working together in the way we expect. This is where integration tests come into play.

Charlie Sheen didn’t write unit tests, and look where that got him

A good way to explain the difference: a unit test would check that a value (let’s say an email, for simplicity) passes a business-logic test (possibly a regex or something — maybe it checks the URL), with the email and the rules provided as mocks/stubs/hard-coded values in the test, while an integration test of this would check the same logic but also retrieve the rules and the value from a database — thus checking that all the pieces fit together and work.

If you want more examples or want to read up a bit more on this, there are great resources on Medium, Stack Overflow, and elsewhere. The rest of this article will assume you are familiar with NodeJS and testing it (here we use Mocha — but feel free to use whatever you like).

Pulling the MS-SQL image

He used Linux containers 🙂

To start, you’ll want to pull the Docker image; simply run the command docker pull microsoft/mssql-server-linux:2017-latest (also, if you haven’t installed Docker yet, you might want to do that first 😃).

This might take a few minutes, depending on what you already have in your Docker cache.

After this is done, make sure to right-click the Docker tray icon, go to “Settings…”, and enable “Expose daemon on tcp://localhost:2375”. As we will see in a few sections, this address needs to be set as process.env.DOCKER_HOST for the Docker modem to connect correctly.
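
In code, that ends up looking something like this (a minimal sketch; the address must match the setting you just enabled):

// Must be set before the Docker client is constructed; the modem reads it from the environment.
process.env.DOCKER_HOST = 'tcp://localhost:2375';

const { Docker } = require('node-docker-api');
const docker = new Docker(); // picks up DOCKER_HOST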

Delaying Mocha for setup

Since we need a few moments to spin up the container and deploy the schema we will use the --delay flag for Mocha.

This adds a global function run() that needs to be called when the setup is done.

You should also use the --exit flag which will kill Mocha after the test run, even if a socket is open.
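
Put together, the Mocha invocation might look something like this (the file paths are illustrative):

mocha --delay --exit --require ./test/setup.js "./test/**/*.spec.js"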

Preparing the run

In this example, we use the --require flag to require a file before the test run. In this file, an IIFE (immediately invoked function expression) is used because we need to call some async functions and await them, and then call the run() function mentioned above. This could be done with callbacks, but it is not as clean.

The IIFE should end up looking like this:

(async () => {
    const container = require('./infra/container');
    await container.createAsync();
    await container.initializeDbAsync();
    run(); // this kicks off Mocha
    beforeEach(async () => {
        console.log('Clearing db!');
        await container.clearDatabaseAsync();
    });
    after(async () => {
        console.log('Deleting container!');
        await container.deleteAsync();
    });
})();

Spinning up the container from Node

In the above IIFE we have the method container.createAsync(), which is responsible for setting up the container.

const { Docker } = require('node-docker-api');
const docker = new Docker();
...
async function createAsync() {
    const container = await docker.container.create({
        Image: 'microsoft/mssql-server-linux:2017-latest',
        name: 'mssqltest',
        ExposedPorts: { '1433/tcp': {} },
        HostConfig: {
            PortBindings: {
                '1433/tcp': [{ HostPort: '<EXPOSED_PORT>' }]
            }
        },
        Env: ['SA_PASSWORD=<S00p3rS3cUr3>', 'ACCEPT_EULA=Y']
    });
    console.log('Container built.. starting..');
    await container.start();
    console.log('Container started... waiting for boot...');
    sqlContainer = container; // keep a module-level reference for later teardown
    await checkSqlBootedAsync();
    console.log('Container booted!');
}

The container is created by the async method docker.container.create. The docker instance needs to have process.env.DOCKER_HOST set; in our case we have a local Docker daemon running (see: Pulling the MS-SQL image), so we’ll use that.

The options are passed to the underlying modem (the same one dockerode uses) and follow the Docker API.

After the container spins up, we need to check that SQL Server has finished booting. Our port is <EXPOSED_PORT> and the password is <S00p3rS3cUr3> (these are placeholders, so make sure you put in something valid).

If you want to read more about what is happening here with the EULA option, etc. check out the guide here from Microsoft.

Since it takes a few seconds for the SQL server to boot up, we want to make sure it is running before firing off the test suite. The solution we came up with was to keep trying to connect, every half second for up to 15 seconds, and exit the loop once connected.

If it fails to connect within 15 seconds, something went wrong and we should investigate further. The masterDb.config options should line up with where you’re hosting Docker and with the port on which you’re exposing 1433 to the host. Also remember the password you set for sa.
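
For reference, a masterDb.config for the mssql package could look roughly like the following; the values mirror the placeholders above and are assumptions, not our production settings:

// Align these values with where Docker is hosted and which HostPort you bound to 1433.
const config = {
    user: 'sa',
    password: '<S00p3rS3cUr3>', // the SA_PASSWORD placeholder from the container Env
    server: 'localhost',
    port: 1433,                 // the <EXPOSED_PORT> you chose in PortBindings
    database: 'master'
};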

async function checkSqlBootedAsync() {
    const timeout = setTimeout(async () => {
        console.log('Was not able to connect to SQL container in 15000 ms. Exiting..');
        await deleteAndExitAsync();
    }, 15000);
    let connecting = true;
    const mssql = require('mssql');
    console.log('Attempting connection... ');
    while (connecting) {
        try {
            mssql.close();
            // don't use await! It doesn't play nice with the loop
            mssql.connect(masterDb.config).then(() => {
                clearTimeout(timeout);
                connecting = false;
            }).catch(() => { /* ignore and retry on the next iteration */ });
        }
        catch (e) {
            // sink
        }
        await sleep(500);
    }
    mssql.close();
}
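
The sleep helper used in the loop above isn’t shown in the snippet; a minimal implementation could be:

// Resolve after the given number of milliseconds.
function sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}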

Deploying db schema using Sequelize

Fun Fact: Liam Neeson used Docker to release the Kraken as well.

We can quickly use Sequelize to deploy the schema by using the sync function; then, as we will see below, it is recommended to set some sort of flag to prevent wiping a non-test DB.

First, though, we want to actually create the DB using the master connection. The code will end up looking something like this:

async function initializeDbAsync() {
    const sql = 'CREATE DATABASE [MySuperIntegrationTestDB];';
    await masterDb.queryAsync(sql, {});
    await sequelize.sync();
    return setTestingDbAsync();
}

Safety checks

Let’s face it, if you’ve been programming professionally for any reasonable amount of time — you’ve probably dropped a database or file system.

And if you haven’t, go run out and buy a lotto ticket, because man, you’re lucky.

This is the reason to set up infrastructure for backups and other things of the sort (roadblocks, if you will) to prevent human error. While the integration test infrastructure you just finished setting up here is great, there is a chance you may have misconfigured the environment variables, etcetera.

I will propose here one possible solution, but feel free to use your own (or suggest more in the comments!).

Here we will use the SystemConfiguration table and store a key-value pair under the key TestDB, whose value needs to be truthy for the tables to be truncated. At multiple steps I also recommend checking that the NODE_ENV environment variable equals test, which helps make sure you didn’t accidentally run this code in a non-test environment.
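
The NODE_ENV part of that check can be as simple as the following sketch:

// Refuse to run destructive test setup outside of the test environment.
if (process.env.NODE_ENV !== 'test') {
    console.log('NODE_ENV is not "test", refusing to touch the database!');
    process.exit(1);
}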

At the end of the last section we saw the call to setTestingDbAsync; its content is as follows:

async function setTestingDbAsync() {
    const configSql =
        "INSERT INTO [SystemConfiguration] ([key], [value]) VALUES (?, '1')";
    return sequelize.query(configSql, {replacements: [systemConfigurations.TestDB]});
}

This sets the value in the database, which we will check for in the next snippet. Here is the snippet of code that checks for the existence of a value under the key TestDB (provided from a consts file) that we just set.

const result = await SystemConfiguration.findOne({ where: { key: systemConfigurations.TestDB } });
if (!result) {
    console.log('Not test environment, missing config key!!!!');
    // bail out and clean up here
}
// otherwise continue

Wiping the test DB before each run

Taking the code above and combining it with something to clear the database, we come up with the following function:

const useSql = 'USE [MySuperIntegrationTestDB];';

async function clearDatabaseAsync() {
    const result = await SystemConfiguration.findOne({ where: {key: systemConfigurations.TestDB }});
    if (!result || !result.value) {
        console.log('Not test environment, missing config key!!!!');
        await deleteAndExitAsync();
    }
    const clearSql = `${useSql}
       EXEC sp_MSForEachTable 'DISABLE TRIGGER ALL ON ?'
       EXEC sp_MSForEachTable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL'
       EXEC sp_MSForEachTable 'DELETE FROM ?'
       EXEC sp_MSForEachTable 'ALTER TABLE ? CHECK CONSTRAINT ALL'
       EXEC sp_MSForEachTable 'ENABLE TRIGGER ALL ON ?'`;
    await sequelize.query(clearSql);
    return setTestingDbAsync();
}
async function setTestingDbAsync() {
    const configSql = "INSERT INTO [SystemConfiguration] ([key], [value]) VALUES (?, '1')";
    return sequelize.query(configSql, {replacements: [systemConfigurations.TestDB]});
}

This will check for the existence of the value for key TestDB in the SystemConfiguration table before continuing. If it isn’t there it will exit the process.

Now how does this run within the context of Mocha?

If you remember, in the IIFE we had a call to beforeEach; this is where you want to place this hook so that you have a clean database for each test.

beforeEach(async () => {
    console.log('Clearing db!');
    await container.clearDatabaseAsync();
});

Shutdown / Teardown

You don’t want to leave Docker in an unknown state, so at the end of the run simply kill the container; you’ll want to use force, too.

Docker reached out to us and said they don’t use exhaust ports

The after hook looks like this:

after(async () => {
    console.log('Deleting container!');
    await container.deleteAsync();
});

And the code inside container.deleteAsync() looks like this:

async function deleteAsync() {
    return sqlContainer.delete({ force: true });
}

Putting it all together

Since this article was a bit wordy and jumped around a bit, here are the highlights of what to do to get this working:

  • Delay Mocha using --delay
  • Require a setup script and use an IIFE to set up the container/DB
  • Spin up a Docker container instance, wait for SQL to boot
  • Deploy the schema using Sequelize and also put in a safety check so we don’t wipe a non-test DB.
  • Hook the wipe logic into the beforeEach hook
  • Hook the teardown logic into the after hook
  • Create amazing codez and test them

I hope you enjoyed this article; suggestions, comments, corrections and more memes are always welcome.

Good luck and happy testing!

The Leap from .NET to Linux – Using PM2 with sensitive configurations in production

Authors: Shmulik Biton, Ron Sevet

It’s not very often in the career of a programmer that one has the privilege of deploying a brand new application to production. We were lucky to be in that position. This was even more special since this was a completely new technology stack that we were introducing here at Tipalti. Our entire server-side stack is based on .NET and the various Microsoft products that go along with it. We have extensive experience with .NET’s best practices and tools, how to run our applications and how to secure them.

This wasn’t the case here. Our new application is written in Node.js, and we wanted to run it on Linux for better performance. This meant we had to find out how to apply the practices and capabilities from our .NET stack on the Linux stack.

We had two major tasks. The first was to find a capable solution for running our Node application. We searched around and found PM2, a very comprehensive process manager with many useful features. It’s popular, well maintained, and looked very promising.

Securing Configurations

After setting up PM2, we needed a way to secure our sensitive configuration keys, like DB credentials and the like. In .NET, you can simply use the Windows Protected Configuration API for a seamless experience. Your web.config is encrypted using the user’s credentials, and .NET handles it without too much fuss. You don’t need to store any encryption keys next to the config, so it’s a pretty good solution.

In Linux, there isn’t anything similar, and after some research we found a best practice that seemed reasonable to us: storing sensitive configurations as environment variables. The reasoning here is that accessing these variables requires either root access or the ability to run code on the machine, at which point there is not much you can do to keep them secret anyway.

At first, all went well. PM2 performed as expected, and using the environment variables was an easy solution.

The issues started after we had duplicated our server and had to change some of those environment variables. We started getting weird errors that we had not seen before, and found out that we were still using the old configuration. Using the `pm2 reload` command did not have any effect; nor did the `restart` command, or even restarting the server itself. This was very baffling. It turned out that PM2 was the culprit: one of PM2’s features forces you to explicitly reload environment variables in order to get the updated values. The flag to use is `--update-env`. This seemed to fix the issue, and we thought the problems were behind us.

We were wrong. After some time had passed, another config change was required, and this time the flag did not work. Nothing seemed to work; rebooting the server did not have any effect either. We figured there must be some sort of cache PM2 was using to store the application state. This was indeed the case.

We had used the command `pm2 save`, which creates a dump file in ~/.pm2/. When the app started/reloaded/restarted, it used that dump file to continue where it left off. This behavior was not desired for two reasons: first, it stopped us from updating our configuration; second, it stored sensitive configurations in the home directory of the user running our app. This can potentially expose us to file traversal attack vectors.

Solving the Issue

We solved it by deleting the dump file and no longer using the `pm2 save` command. We also changed the default startup script PM2 creates when you call `pm2 startup`.

This is our final startup script:

[Unit]
Description=PM2 process manager
Documentation=https://pm2.keymetrics.io/
After=network.target

[Service]
Type=forking
User=
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
EnvironmentFile=/your/env/file
Environment=PATH=/usr/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Environment=PM2_HOME=/home/integration/.pm2
PIDFile=/home/integration/.pm2/pm2.pid

ExecStart=/usr/lib/node_modules/pm2/bin/pm2 start /home/integration/.pm2/ecosystem.config.js
ExecStartPost=/usr/lib/node_modules/pm2/bin/pm2 reload /home/integration/.pm2/ecosystem.config.js
ExecReload=/usr/lib/node_modules/pm2/bin/pm2 reload /home/integration/.pm2/ecosystem.config.js --u
ExecStop=/usr/lib/node_modules/pm2/bin/pm2 kill

[Install]
WantedBy=multi-user.target

Note that you need to specify the environment file explicitly in the startup script, otherwise it won’t be loaded. Since we are already specifying the file path, we decided to use a dedicated file with root permissions instead of /etc/environment. This prevents a general process from having access to our configuration.
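
For example, locking the file down could look like this (using the placeholder path from the unit file above):

sudo chown root:root /your/env/file
sudo chmod 600 /your/env/file

systemd reads EnvironmentFile as root before dropping to the service user, so the application still receives the variables.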

A quick note on key-store solutions: we felt they would add another layer of complexity without adding much security. If someone got root access to the machine, they could get the configuration no matter what we did. This is why we decided not to go that route.

In summary

If you want to run a Node.js application under Linux using PM2, these are the steps that we found worked best:

  1. Put sensitive configuration variables in a root-accessible file
  2. Do not use the `pm2 save` command
  3. Change the startup script as shown above
  4. Reload your application after changing the configuration using
    `pm2 reload <your config file name> --update-env`

White labeling software UI/UX using LESS

What is White Labeling?

“White Labeling” (also known as “skinning”) is a common UI/UX term for a product or service that is produced by one company and rebranded by another, giving the appearance that the product or service was created by the marketer (or the company s/he represents). Rebranding has been used with products ranging from food, clothing, vehicles and electronics to all sorts of online wares and services.

For many of Tipalti’s customers, white labeling is key to providing a seamless payment experience for payees such as publishers, affiliates, crowd and sharing economy partners, and resellers. Rather than take a user out of one portal to a third party site and break the communication chain, Tipalti enables the paying company to deeply embed the functionality within their own experience.

How does white labeling work in Tipalti?

Tipalti’s payee portal is fully customizable, allowing our customers to match their corporate brand to every step of the onboarding and management process. For example, entering their contact data, selecting their payment type, completing tax forms, and tracking their invoicing and payment status are all aspects that a payee would need to see and can be presented with the customer’s brand. Customers use their own assets to ensure a seamless look and feel in any payee-facing content.

To support the wide array of branding while maintaining a consistent code-base for all customers, we built the payee portal using LESS.

What is LESS?

CSS preprocessors are a staple in web development. Essentially, CSS pre-processors extend plain CSS into something that looks more like a programming language.

LESS was created in 2009 by Alexis Sellier, also known as @cloudhead. It has become one of the most popular preprocessors available and has been widely adopted in numerous front-end frameworks like Bootstrap. It adds programming traits to CSS, such as variables, functions or mixins, and operations, which allow web developers to build modular, scalable, and more maintainable CSS styles.

See the LESS documentation for more information on how LESS works, its syntax and tools.

How is LESS used for white labeling?

Let’s assume you want to use LESS for styling a blog post. Your LESS CSS stylesheet will look something like this:

@post-color: black;
@post-background-color: white;
@sub-post-color: white;
@sub-post-background-color: black;

.post {
	color: @post-color;
	background-color: @post-background-color;
	.sub-post {
		background-color: @sub-post-background-color;
		color: @sub-post-color;
	}
}

With this styling, the blog post will roughly look like this:

[Image: the blog post rendered with the default black/white theme]

After a while, let’s say you make contact with a new business partner. The partner wants to display your blog post inside their website, but since the color scheme of the blog post does not match the general style of the partner’s website, they want to create custom styling that will match their color scheme. This is essentially the process of “white labeling.”

So how do you change the styling to match both color schemes?

All of this can be done in plain CSS alone, but imagine having to override each CSS class in your blog stylesheet for each theme. Not that attractive, right?

If you are using LESS, the solution is simple: you just override the variables related to each property you want to change. The changes are all concentrated in a single location, which makes applying them much easier, faster and less prone to errors.

If your partner prefers the post background color to be blue, you will simply need to do the following:

  1. Create another LESS file
  2. Inside of it, add the line “@post-background-color: blue;”
  3. Compile the new LESS file after compiling the original one

That’s all it takes to achieve the desired layout change. No need to worry that you forgot to update any CSS classes.

[Image: the blog post rendered with the partner’s blue background theme]

Complex LESS customizations in Tipalti

For our product, each Tipalti customer can easily customize their own payee portal simply by overriding variables in their own LESS file.

The array of possible customizations reaches far beyond colors. For example, our customers are able to change the layout of fieldsets, icons, buttons, element positions, width/height calculations and more. Because of LESS, the rebranding never interferes with the common code, which is important, as there is a lot of intelligence built into our payee portal. Likewise, the customer never has to write any custom code around the logic and execution of the processes. When you’re dealing with something as complex as global payments and validations on payment methods, this is incredibly important.

Here are examples of two very different LESS override stylesheets applied on the same base stylesheet:

[Images: two very different LESS override stylesheets applied to the same base stylesheet]

Other uses for LESS

LESS is very useful when you’re developing services for customers that feature their brand, but it’s also very clean and efficient for creating a flexible user interface and user experience for your product. For example, when you have many repeated elements that need to be styled, LESS can simplify the effort with variables, operators, and mixins for cross-browser and mobile/responsive support.

Software Testing in Remote Control – Offshoring QA

The term “Global Village” sounded novel in the past, and it came to describe the removal of geographical boundaries, especially in the era of mass media and the Internet. Today it already seems obvious to us that a project (not necessarily in the high-tech industry) might import raw materials from China, use components manufactured in the United States and Germany, and be assembled in Japan. Geographical barriers have been removed for software projects as well: a project might, for example, take its requirements from a US product team, develop the software in the UK, and send it to India for testing.

Pros and Cons of Offshore Software Testing

The decision to outsource software testing to an offshore team takes into account several considerations.

The main obstacle in offshore testing is, quite literally, the team being offshore. True, we have communication channels like Skype and other software allowing chats in text, audio or video, but it’s still hard to compare to simply walking to the neighboring office or cubicle and actually talking to the tester.

Also, as Agile and Scrum methods are relied on more and more, using an offshore testing team might be a challenge, since a stand-up meeting over Skype is just not that intuitive (and sometimes impossible because of the time difference), and integrating the testers might be complex and take effort and patience.

Another possible issue is language and local culture differences. Most workers do know basic English, but getting used to reading documents and emails with a lot of grammar mistakes, or with phrases you don’t understand, can still get annoying.

Still, there are some benefits here. Most of the time, development or testing is outsourced because of cost considerations. So, sure, in some countries the cost of testing or development is lower than in others, but that’s not necessarily the only benefit.

Working with an outsourcing company, you won’t need to manage the recruitment process, which can be tedious and hard, as the outsourcing company will take care of that for you.

In addition, for countries like Israel, where our development center resides and whose working days are Sunday-Thursday, having a testing team that works on Friday and part of Saturday is a big plus: you can have code ready for testing on Thursday, go off for your weekend, and come back the next week with the feature already tested.

Some Tips

To conclude, here are a few tips for working with offshore testing teams:

  • Make sure there are direct lines of communication between the testing team and everyone else (development and product teams).
  • Make sure you maintain everyday contact with the offshore testing team, just as you would if they were on site.
  • Always make sure the testers are in the loop. They need to stay on top of issues and get updated on every change made during the design and development stages. Make them feel like part of the team and inspire them to work for the good of the project, not just their paycheck.
  • Encourage your offshore team to make suggestions to the work process and the product in general, making sure their voice is heard.
  • The work tasks have to be understood, especially for complex features. One way to ensure this is to have the offshore team repeat the workflow described in the feature, to make sure they got it right.

Challenges of Name String Comparisons – Keeping Our Customers Compliant

Name scanning in Tipalti

Multiple organizations in the US (OFAC, Dept. of State, BIS, etc.) publish lists containing names of individuals and companies that are targeted under one or more sanction programs.

In the event that an entity from one of these sanction lists is potentially involved in a money transfer, additional due diligence must be conducted before proceeding.
Tipalti helps its customers maintain compliance by scanning these lists for every payee being paid.

This proves to be a challenging process. The difficulties stem from two main issues:

  1. The comparison is a name-to-name comparison, meaning we need to compare two strings that can vary greatly but still be linguistically equivalent.
  2. Existing algorithms that are commonly used for name comparisons can have poor performance with foreign names that are transcribed to English.

Common algorithms for name comparison and why they don’t always work

Our scanning process consists of multiple stages in which we run algorithms like Jaro-Winkler distance, Levenshtein distance, Soundex similarity, etc. Each algorithm operates at a different stage of the process and has a different purpose, but there are times when, even using all of these algorithms, the result is not sufficient.

Let’s look at the following example: the Chinese name 习 is usually transcribed to English as “Xi”, but since the approximate pronunciation in English is “Shi” or “Shee”, we need to treat those as equivalent. How do the algorithms mentioned above perform when comparing “Xi” with “Shi”?

Comparison of "Xi" and "Shi"

Algorithm       Score
---------       -----
Levenshtein     33 (out of 100)
Jaro-Winkler    61 (out of 100)
Soundex         NOT EQUAL (X000 vs S000)

As you can see, all of the scores here are too low to signal that there’s a match.
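
For instance, the Levenshtein score in the table can be reproduced with a quick sketch (a normalized similarity computed from the edit distance; this is an illustration, not our production code):

// Classic dynamic-programming Levenshtein distance, then a 0-100 similarity score.
function levenshtein(a, b) {
    const d = Array.from({ length: a.length + 1 }, (_, i) =>
        Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
    );
    for (let i = 1; i <= a.length; i++) {
        for (let j = 1; j <= b.length; j++) {
            d[i][j] = Math.min(
                d[i - 1][j] + 1,                                   // deletion
                d[i][j - 1] + 1,                                   // insertion
                d[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
            );
        }
    }
    return d[a.length][b.length];
}

const score = Math.round((1 - levenshtein('xi', 'shi') / 3) * 100);
console.log(score); // 33, since the distance is 2 against a maximum length of 3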

Let’s see how we handle these cases in Tipalti.

Tipalti’s phonetic rules engine

Given the problem of non-English names that can be transcribed to English in multiple forms, we needed to design a solution that can generate all the different forms of the same name.

We do this by creating rule-based patterns that can describe multiple forms of the same string. Here’s an example:

Let’s assume my Hebrew name can be transcribed to English as both “Shauli” and “Shaoli”. In this case, we can create a rule that allows us to replace “ao” with “au”. This rule would look like “{au,ao}”, and a matching pattern for my name would look like “sh{au,ao}li”. This pattern describes both forms of my name.

We can also incorporate two rules in the same pattern. For example, let’s add another rule for the suffix of my name: “{li,ly,lee}”.

Now, when applying these two rules together, we get the following pattern, “sh{au,ao}{li,ly,lee}”, which matches the following strings: “shauli”, “shauly”, “shaulee”, “shaoli”, “shaoly”, “shaolee”.

In order to create this set of rules, we use the help of linguistic experts that specialize in different languages.

Let’s take a look under the hood of our phonetic rules engine.

How does the phonetic rules engine work?

This is a general outline of the engine algorithm (pseudo-code):

// This method receives a name (e.g. "shauli") and a set of phonetic rules (e.g. "{ao,au}")
// and returns all the equivalent strings these rules allow (e.g. "shauli", "shaoli").
GenerateEquivalentNames(name, rules)
	1. Find all the appearances of the rules in the name.
	2. Integrate the rules into the name in its matching positions (e.g. the output will look like "sh{au,ao}li").
	3. Generate all string combinations from the above pattern.
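
A compact JavaScript sketch of the same idea (illustrative only; it collapses the pattern-building and expansion steps, and is not the production engine):

// rules: arrays of interchangeable substrings, e.g. [['au', 'ao'], ['li', 'ly', 'lee']]
function generateEquivalentNames(name, rules) {
    if (name.length === 0) return [''];
    for (const rule of rules) {
        for (const variant of rule) {
            if (name.startsWith(variant)) {
                // expand the rest of the name, then prefix every alternative of this rule
                const tails = generateEquivalentNames(name.slice(variant.length), rules);
                return rule.flatMap(alt => tails.map(tail => alt + tail));
            }
        }
    }
    // no rule matches here; keep the current character and move on
    const tails = generateEquivalentNames(name.slice(1), rules);
    return tails.map(tail => name[0] + tail);
}

console.log(generateEquivalentNames('shauli', [['au', 'ao'], ['li', 'ly', 'lee']]));
// -> [ 'shauli', 'shauly', 'shaulee', 'shaoli', 'shaoly', 'shaolee' ]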

After generating the list of equivalent names, we scan each of them against the sanctions lists.

Conclusion

This approach may increase false positives, but that is preferable for security controls. It’s better to limit more conservatively at first, and then allow access once you have some form of human verification, rather than be irresponsible and less discriminating. We’re trying to adhere to a federal law, after all.

If you have specific questions about how the engine works, please feel free to contact me or leave a comment.

Node.js vs. IIS + WebAPI async – Performance Test

Everyone today has probably heard of Node.js, which is commonly used for handling large numbers of requests. It uses a single-threaded, event-based model that runs blocking I/O operations asynchronously, keeping the main thread open and ready to handle more incoming requests. Microsoft’s IIS web server, however, has a reputation for being slow and bulky.

There’s a good reason for it: by default, ASP.NET handles requests in a blocking manner; each request will take up a thread until it completes. If more requests come in than there are threads in the thread pool, they will be queued. But that’s not the only way to write .NET web code! ASP.NET has supported async processing for a very long time now, but relatively few people use it in their code.

Since IIS is what we’re using here at Tipalti, we were curious, how does Node.js compare to IIS + .NET using the best async practices?

Read More »