Regular expressions (regexp) enable you to validate Text input in BRYTER. For instance, you can enforce a specific format, such as party names written in capitals in a contract: JOHN instead of John.
This article provides tutorials on standard applications of regular expressions in BRYTER. Additionally, we have added a cheat sheet with examples to copy and paste from.
If you want to learn more about regular expressions from external sources, you can also find plenty of good online resources to study, as well as examples to copy from.
Tutorials for using regular expressions
As mentioned above, you can restrict your user’s response to only use uppercase letters. This is useful when you want to create a contract that requires names to be submitted in all uppercase letters. It’s a good practice to write a placeholder text or tip to instruct your users to only use upper case letters in their response. To do so, enter the following expression:
^[A-Z]+$
You can simply copy and paste this into the appropriate Text input or keep reading for an explanation about the different parts of the Regular Expression.
[A-Z]
This part of the Regular Expression allows the user to enter all uppercase characters from “A” to “Z”. In general, square brackets such as “[ ]” allow you to specify ranges. However, note that this particular range [A-Z] only includes letters used in English. Other uppercase letters, such as “Ä” or “Ü” in German, are not allowed. If you want to include these letters, simply add them to the square brackets as follows:
[A-ZÄÖÜ]
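If you want to try the range outside BRYTER, here is a minimal Python sketch. It assumes that Python's standard `re` module treats these character ranges the same way BRYTER's validation does, which this article does not confirm. The helper name `accepts` is hypothetical, and `re.fullmatch` is used so that the whole value has to fit the pattern.

```python
import re

def accepts(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None

# [A-Z] covers only the 26 unaccented uppercase letters.
print(accepts(r"[A-Z]+", "MUELLER"))     # True
print(accepts(r"[A-Z]+", "MÜLLER"))      # False: "Ü" is outside A-Z
# Adding the umlauts to the set makes them valid.
print(accepts(r"[A-ZÄÖÜ]+", "MÜLLER"))   # True
```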
More than one character
+
The plus sign “+” following the brackets means that the preceding element must appear one or more times, so the user can enter more than one character. Without the “+” symbol, the brackets stand for exactly one letter.
^$
Finally, these two symbols state that the match must start at the beginning of the line ( ^ ) and finish at the end of it ( $ ). This means that the input may not contain any characters other than those specified between the two symbols.
Examples
The regex [A-Z]+ accepts any input value that contains at least one of the characters A-Z. Correct values would be ABC, abC, or small Capital and white spaces; an incorrect value would be abc.
The regex ^[A-Z]+$ accepts only input values that consist entirely of the characters A-Z. A correct value would be ABC; incorrect values would be abCd and CAPITAL AND WHITE SPACES.
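The difference between the two expressions can be reproduced with a short Python sketch. It assumes that Python's `re.search` behaves like BRYTER's validation for these simple patterns, which is an assumption rather than something this article states. The helper name `matches` is hypothetical.

```python
import re

def matches(pattern: str, value: str) -> bool:
    """Return True if the pattern is found anywhere in the value."""
    return re.search(pattern, value) is not None

# Compare the unanchored and the anchored expression on the same inputs.
for value in ["ABC", "abC", "abc", "CAPITAL AND WHITE SPACES"]:
    unanchored = matches(r"[A-Z]+", value)
    anchored = matches(r"^[A-Z]+$", value)
    print(f"{value!r}: [A-Z]+ -> {unanchored}, ^[A-Z]+$ -> {anchored}")
```

Only ABC passes both checks. The input abc fails both, and the other two values pass only the unanchored expression.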
We can extend the code snippet above to check for the same uppercase conditions for exactly two words. That is, from a technical perspective, some characters followed by a space and then some additional characters. Generally, in Regular Expressions, you are creating small validation blocks which you append in a line. You can do this as follows:
^[A-Z]+\s+[A-Z]+$
As above, you can simply copy and paste this into the appropriate Text input or keep reading for an explanation about the different parts of the Regular Expression.
[A-Z]+
As discussed in the first example, this snippet allows for more than one uppercase English character.
\s
This allows the user to insert a space between the two strings of uppercase letters. In the expression above, \s is followed by a “+” sign, so one or more whitespace characters are accepted between the two words. If you want to allow exactly one space, use \s without the “+”.
^$
If you use these two symbols, nothing other than what is specified inside the regex may appear in the input value.
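To see how the two-word check behaves, here is a minimal Python sketch using the expression with both ^ and $ in place. It assumes Python's `re` module is a fair stand-in for BRYTER's validation, and the function name `is_two_uppercase_words` is hypothetical.

```python
import re

# Two uppercase words separated by one or more whitespace characters.
TWO_WORDS = re.compile(r"^[A-Z]+\s+[A-Z]+$")

def is_two_uppercase_words(value: str) -> bool:
    """Return True if the value is exactly two uppercase words."""
    return TWO_WORDS.search(value) is not None

print(is_two_uppercase_words("JOHN DOE"))        # True
print(is_two_uppercase_words("JOHN  DOE"))       # True: \s+ accepts several spaces
print(is_two_uppercase_words("JOHN"))            # False: only one word
print(is_two_uppercase_words("JOHN DOE SMITH"))  # False: three words
print(is_two_uppercase_words("John Doe"))        # False: lowercase letters
```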
Let’s say your user has a second surname that needs to be entered into a contract. We need a more flexible validation here. The good thing is, we can simplify the code for our validation like this:
^[A-Z\s]+$
As above, you can simply copy and paste this into the appropriate Text input or keep reading for an explanation about the different parts of the Regular Expression.
To sum up: You can put any uppercase letter in square brackets “[ ]”. Using this pattern, you determine which characters users are allowed to enter in the Text input. There is no fixed structure: you are not limiting the number of letters or words here because this is a basic filter mechanism. It also works with special characters.
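The flexible filter can be tried out with the following Python sketch. It assumes Python's `re` module behaves like BRYTER's validation here, and the name `is_valid_name` is hypothetical. Note the last case: because the set also contains \s, an input made of spaces alone passes as well.

```python
import re

# Any mix of uppercase letters and whitespace, at least one character long.
NAME = re.compile(r"^[A-Z\s]+$")

def is_valid_name(value: str) -> bool:
    """Return True if the value contains only A-Z and whitespace."""
    return NAME.search(value) is not None

print(is_valid_name("JOHN"))            # True
print(is_valid_name("JOHN DOE SMITH"))  # True
print(is_valid_name("John Doe"))        # False: lowercase letters
print(is_valid_name("   "))             # True: whitespace alone also passes
```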
Use Regexp to determine whether a SEPA money transfer is national or international
Regular expressions can be found in both Text nodes (custom validations for the user input field) as well as in conditions. Regexp can be used to determine whether a SEPA money transfer is national or international.
Example:
[D][E]\d{20}
In this example, our module splits into two different paths, each triggering a different set of questions, depending on whether a SEPA money transfer is national or international. We are setting the condition that IF the user's IBAN contains DE ( [D] and [E] ) followed by 20 digits ( \d{20} ), THEN we are dealing with a national SEPA money transfer. In every other case, the module will automatically follow the path for an international SEPA money transfer. Note that without ^ and $ the expression only checks that this sequence appears somewhere in the input. To require that the IBAN starts with DE and is followed by exactly 20 digits and nothing else, use ^[D][E]\d{20}$.
Tip: Get the most out of your module by using regexp in Text nodes and conditions. For example, within the node 'payment information' you can include a regular expression to ensure that the user's IBAN starts with two letters and is followed by a minimum of 15 and a maximum of 31 digits (min. and max. length of IBAN in the SEPA countries).
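The branching logic of this condition can be sketched in Python as follows. This is an illustration only: it assumes Python's `re` module matches the way BRYTER's conditions do, the function name `transfer_type` is hypothetical, the anchored variant with ^ and $ is an addition to the expression shown above, and the IBANs are sample values.

```python
import re

# The article's pattern, plus an anchored variant (the anchors are an addition).
UNANCHORED = re.compile(r"[D][E]\d{20}", re.ASCII)
ANCHORED = re.compile(r"^[D][E]\d{20}$", re.ASCII)

def transfer_type(iban: str, pattern: re.Pattern = ANCHORED) -> str:
    """Classify a transfer as national (German IBAN) or international."""
    return "national" if pattern.search(iban) else "international"

german = "DE89370400440532013000"       # DE followed by 20 digits
french = "FR7630006000011234567890189"
too_long = german + "99"                # DE followed by 22 digits

print(transfer_type(german))                # national
print(transfer_type(french))                # international
print(transfer_type(too_long))              # international
print(transfer_type(too_long, UNANCHORED))  # national: the extra digits are ignored
```

The last line shows why the anchors matter when "exactly 20 digits" is the intended rule.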
Let’s say you want to build an email validator. You want to check that your user has input their email address correctly while they are using your module.
An email address needs to contain an “@” and a dot to be approved. Before the “@” you want to allow a variety of characters but after the “@” and the dot, you will be strict with your validation.
Let’s decompose the email address: john.doe@bryter.io
You start with the first part: “john.doe” Here you allow a set of characters and a dot. Your regexp will look like this:
[\w.]+
Remember, \w allows a letter, a digit, or an underscore, so you need to add the dot manually. The plus allows more than one character to be entered.
Then, we want to validate the “@”. This step is pretty straightforward because you just need to use a character set containing only that character: [@]. There is no plus here because exactly one “@” is expected. Finally, you will add this to your previous validation, so your regexp looks like this:
[\w.]+[@]
Now, we want to check for the email provider, “bryter”. Here, you follow the same step as in step 1. This time, however, we don’t need to add the dot. Simply add a range with the \w which looks like this: [\w]+ Now, your updated regex looks like this:
[\w.]+[@][\w]+
Now, we have to check for the dot. You have to add the dot in this place to keep the order. If you were to add the dot in the step above, you would allow multiple dots after the “@”, which you don’t want. So we have to make an explicit validation here: [.]
[\w.]+[@][\w]+[.]
Finally, you want to add the top-level domain: “io”. Here you only want to allow for a limited set of characters, so you will use:
[a-z]+
This is your final regexp:
[\w.]+[@][\w]+[.][a-z]+
Now you can test it on your own and type in multiple email addresses. You can adjust it to your needs as well. If it should allow special characters at the beginning like “ÄÖÜ”, just add them in the first step within the range indicated in the square brackets.
If you want to specify that the input should not contain anything but the email you should provide the symbols ^$ as well:
^[\w.]+[@][\w]+[.][a-z]+$
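To test the finished expression against a few addresses, you could use a Python sketch like the one below. It assumes Python's `re` module is a reasonable stand-in for BRYTER's validation. The `re.ASCII` flag is an assumption added so that \w covers only unaccented letters, digits, and the underscore, in line with the cheat sheet below. The name `is_valid_email` is hypothetical.

```python
import re

EMAIL = re.compile(r"^[\w.]+[@][\w]+[.][a-z]+$", re.ASCII)

def is_valid_email(value: str) -> bool:
    """Return True if the value fits the email pattern built above."""
    return EMAIL.search(value) is not None

print(is_valid_email("john.doe@bryter.io"))       # True
print(is_valid_email("john.doe@bryter"))          # False: no dot and top-level domain
print(is_valid_email("john@doe@bryter.io"))       # False: second "@"
print(is_valid_email("john.doe@mail.bryter.io"))  # False: only one dot allowed after "@"
print(is_valid_email("john-doe@bryter.io"))       # False: "-" is not covered by \w
```

The last two cases show where this deliberately simple pattern is stricter than real-world email addresses, so adjust the character sets if your users need subdomains or hyphens.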
Let’s say you build a flight rights module that helps people claim compensation for delayed or canceled flights. In the end, a contract will be created which they can send to the airline. A part of that contract is the correct flight number. To avoid errors, validate the flight number input with a Regular Expression.
A flight number contains a two-character airline designator and a 1- to 4-digit number.
Let’s decompose the flight number: LH 3442
First, validate the first part: “LH”. You need to limit the input to two uppercase characters. Start by specifying a range within square brackets: [A-Z]. Now you need to allow only two characters. To set the number of characters, use the curly brackets and type in the number. In this case, it is {2}. So far, your Regular Expression looks like this:
[A-Z]{2}
After the two first characters, you will need to add a space using \s. Your Regular Expression should now look like this:
[A-Z]{2}\s
Now we need to validate the flight number. We need to allow this to be a 1 to 4 digit number. Remember that for digits, we must use the [\d] expression. Finally, you need to add a range that allows 1 to 4 digits. In this case, you will append {1,4}. The comma acts like a “to”, so the Regular Expression is:
[\d]{1,4}
Your final regexp is:
[A-Z]{2}\s[\d]{1,4}
Again, if you want to specify that the input should not contain anything but the flight number you should provide ^$ symbols as well:
^[A-Z]{2}\s[\d]{1,4}$
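The flight number expression can be checked with a short Python sketch. It assumes Python's `re` module behaves like BRYTER's validation. The `re.ASCII` flag is an assumption added so that \d covers only the digits 0 to 9. The name `is_valid_flight_number` is hypothetical.

```python
import re

FLIGHT_NUMBER = re.compile(r"^[A-Z]{2}\s[\d]{1,4}$", re.ASCII)

def is_valid_flight_number(value: str) -> bool:
    """Return True for two uppercase letters, one space, and 1 to 4 digits."""
    return FLIGHT_NUMBER.search(value) is not None

print(is_valid_flight_number("LH 3442"))   # True
print(is_valid_flight_number("LH 1"))      # True
print(is_valid_flight_number("LH3442"))    # False: the space is missing
print(is_valid_flight_number("LH 34421"))  # False: five digits
print(is_valid_flight_number("lh 3442"))   # False: lowercase designator
```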
Cheat sheet
\s
This allows for a single whitespace. Using this, we accept a “space” between words or letters.
\w
This allows for a single word character. Using this, we can accept a single letter, a digit, or an underscore. We do not accept special characters like “ÄÖÜ” or “!?%”. To accept those, you would have to define the character set yourself.
\w+
This allows for multiple characters. Here we can allow a word to be entered without limiting the length of the word.
\d
This allows for a single digit. The user can type in a number with one digit between 0 – 9.
\d+
This allows for multiple digits. The user can enter any kind of number but without a dot or a comma.
[abc]
This allows for any one of the characters listed inside the square brackets, here “a”, “b”, or “c”. The list can also contain special characters like “!?%”. Remember that in this case only one character is allowed to be typed in by the user. This validation is case-sensitive. For example, if you want to support an “a” and an “A” character, your regular expression should look like this: [aA]
[abc]+
This allows for multiple characters.
[a-z]
This allows for any kind of lowercase letter between “a” and “z”.
[a-z]+
This allows for multiple lowercase letters between “a” and “z”.
[A-Z]
This allows for any kind of uppercase letter between “A” and “Z”.
[A-Z]+
This allows for multiple uppercase letters between “A” and “Z”.
[0-9]
This allows for any kind of digit between “0” and “9”.
[0-9]+
This allows for multiple digits between “0” and “9”.
[^abc]
This forbids any kind of character which is in these square brackets. Using “^” at the beginning of your set or range is reversing the behavior. If you want to forbid a special character like “!?%” then you would type in: [^!?%].
[^abc]+
This forbids characters that are in this range when the user types in multiple characters.
{5}
This specifies exactly how many times the preceding element must appear. If you want a country code that allows only two uppercase letters, your regexp will look like: [A-Z]{2}
{1,5}
This specifies the number of characters that are allowed based on a range, such as anything from 1 character to 5 characters.
^$
Remember to use these symbols if you want your regex rules to apply from the beginning of the input ( ^ ), up to the end of the input ( $ ), or both.
It’s worth remembering that although the regex \d+ reads as “multiple digits”, on its own it means “contains at least one digit”. By adding ^ we convert it to “starts with at least one digit” (^\d+); by adding $ we convert it to “ends with at least one digit” (\d+$). Finally, by adding them both we convert it to “contains digits only” (^\d+$).
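Several cheat sheet entries can be tried out with the Python sketch below. It assumes that Python's `re.search` behaves like BRYTER's validation for these patterns, which this article does not confirm. The `re.ASCII` flag is an assumption added so that \d covers only 0 to 9, and the helper name `found` is hypothetical.

```python
import re

def found(pattern: str, value: str) -> bool:
    """Return True if the pattern is found anywhere in the value."""
    return re.search(pattern, value, flags=re.ASCII) is not None

# Anchors: the same \d+ rule with and without ^ and $.
for pattern in [r"\d+", r"^\d+", r"\d+$", r"^\d+$"]:
    results = {v: found(pattern, v) for v in ["123", "123abc", "abc123", "a1b"]}
    print(pattern, results)

# Negated set: without anchors, one allowed character is enough to pass.
print(found(r"[^!?%]+", "hi!"))          # True: "hi" alone satisfies the pattern
print(found(r"^[^!?%]+$", "hi!"))        # False: "!" is forbidden everywhere

# Quantifiers: exactly two letters, or one to five letters.
print(found(r"^[A-Z]{2}$", "DE"))        # True
print(found(r"^[A-Z]{2}$", "DEU"))       # False: three letters
print(found(r"^[a-z]{1,5}$", "abcdef"))  # False: six letters
```

In the first loop, only the input 123 passes all four variants, which mirrors the distinction between “contains digits” and “contains digits only”.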
Keywords: boolean; regex