Matchcode FAQs
General Questions
Data Formats
Creating A New Job File
Running Batches
Typical Scenarios
Troubleshooting
Matchcode Fields
Matchcode Flags
Glossary
General Questions
What is the purpose of Matchcode
software?
Matchcode is an address
management package, developed to identify UK addresses using the Royal Mail PAF file as a
reference source. It can also use other data sets such as Commercial PAF. Processing can
be in either batch or interactive (Manual) mode.
Once Matchcode has
successfully identified a UK address it can do a number of things including:
- Return a clean version of
the address as stored on PAF.
- Supply a postcode for the
address.
- Supply a number of other
address related codes and information associated with the address.
Can my address file be damaged or
changed in any way, when using Matchcode?
No. Matchcode never writes
to your original file at all. Each time you process a file you create a new output file
only. This allows you the freedom to experiment with different settings until you are
satisfied with the results.
Does Matchcode produce any kind of
report to show how successful a batch has been?
Yes. To enable the log file
option, open both the Batch and Job Control programs, then select Log File from the
Options menu of each. This opens a dialog box in which you can opt to save processing
results in a log file. Once this option has been set, Matchcode will always create a log
file containing processing statistics. The log file is created in the
same location as the Job File and uses the Job File name with a .LOG extension. It is
a simple text file that can be viewed with Notepad or a similar application.
The log file stores
information about:
- Filenames
- PAF files used
- Batch success criteria
- Input postcode quality
- Input address quality
- Processing speed
- Success level achieved
Data Formats
What types of file can Matchcode
use?
Address information is
stored on company systems in many different forms, from simple spreadsheets to
sophisticated CRM systems. In order to process your address list with Matchcode you need
to produce a text document or file. Most systems provide a facility to
Export/Save As a text file. Usually this is called a CSV (Comma Separated Values) file or a
character-delimited file. This type of file is the most suitable for Matchcode to use,
as it requires the least setting up time. Alternatively, a fixed length field/record file
can be used. In either case, it needs to be an ASCII text file.
My file comes from a Unix system and
has only Linefeed record terminators. Can Matchcode handle these?
Yes. Matchcode can handle
all common variants of line termination, including no terminator at all.
Why do I need to export my data into
a text file to use Matchcode?
It would be possible in
some cases to provide access to various file types directly or through ODBC interfaces.
This would make life easier in the short term because it would eliminate the need to
export files as text. However, consider the consequences of working on a live database
file directly. Apart from the record locking issues etc. there would always be the
possibility that you could make a mistake in the output format of your job. This would
effectively destroy your contact information, leaving you to rely on backups to restore
your system. Also you may wish to add new fields to your file, which were not previously
part of the layout.
I have had problems processing my
CSV file, are there any known pitfalls with this type of file?
There is one particularly
common problem associated with CSV type files. This comes about because these files
typically use the Comma character as a field delimiter. In other words, a Comma separates
the individual fields within a CSV type record from each other. Other characters can be
used, such as the Pipe (|) or Hash (#) but this is less common.
A problem sometimes occurs
when fields within the record contain Commas of their own, as part of the text. Normally
this would not be a problem because a well-constructed CSV record that uses Comma
delimiters would have Quote characters around the fields. These are called text
qualifiers.
Example:
000001,MR JOE SMITH,1 THE CRESCENT, OFF THE HIGH STREET,THE VILLAGE,THE TOWN,AB1 1AB
As you can see from the
example above, a Comma within the third field separates 1 THE CRESCENT from
OFF THE HIGH STREET. The problem arises when the fields do not have Quotes
around them. When this happens, Commas such as in the above example are interpreted as
additional field delimiters. Obviously this can be a problem because the file suddenly has
an inconsistent format.
To overcome this we should
ensure that exported CSV type Comma delimited files have text fields enclosed in
double-quotes. Alternatively some other delimiter could be used, such as the Pipe
character as mentioned above. If neither of these options is available try outputting the
file as fixed length text.
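The effect of missing text qualifiers can be checked outside Matchcode before a job is set up. The following is a minimal sketch using Python's standard csv module; the quoted version of the record and the helper names are assumptions made for illustration, not Matchcode functionality.

```python
import csv
import io

# The record from the example above, first without and then with text qualifiers.
unquoted = ('000001,MR JOE SMITH,1 THE CRESCENT, OFF THE HIGH STREET,'
            'THE VILLAGE,THE TOWN,AB1 1AB\n')
quoted = ('"000001","MR JOE SMITH","1 THE CRESCENT, OFF THE HIGH STREET",'
          '"THE VILLAGE","THE TOWN","AB1 1AB"\n')


def parse(line):
    """Split one CSV record into its fields."""
    return next(csv.reader(io.StringIO(line)))


def requote(fields):
    """Write one record with every field enclosed in double quotes."""
    out = io.StringIO()
    csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(fields)
    return out.getvalue()


unquoted_fields = parse(unquoted)
quoted_fields = parse(quoted)

# The embedded comma is read as an extra delimiter when quotes are missing.
print(len(unquoted_fields), len(quoted_fields))
print(requote(quoted_fields), end="")
```

The unquoted record yields one field more than the quoted one, which is the inconsistency described above.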
My system can only produce fixed
length records. Can I process this kind of file?
Yes you can. By default
Matchcode expects Comma delimited variable length records, but it is simple to change
this by specifying field lengths instead.
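To illustrate what specifying field lengths means for a fixed length record, here is a small Python sketch. The layout (6, 20, 30 and 8 bytes) and the function name are hypothetical, chosen only for the example; Matchcode's own handling is configured in Job Control.

```python
def split_fixed(record, lengths):
    """Slice one fixed length record into fields using a list of field lengths."""
    fields = []
    pos = 0
    for length in lengths:
        # Trailing padding is stripped; a short final field is simply returned as-is.
        fields.append(record[pos:pos + length].rstrip())
        pos += length
    return fields


# Hypothetical layout: reference, name, one address line, postcode.
LENGTHS = [6, 20, 30, 8]
record = "000001" + "MR JOE SMITH".ljust(20) + "1 THE CRESCENT".ljust(30) + "AB1 1AB"

print(split_fixed(record, LENGTHS))
```

Note that the last field in this sample is not padded to its full 8 bytes, yet it is still read correctly because slicing stops at the end of the record.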
Creating A New Job File
What information do I need before I
can clean my address file?
There are three basic
groups of information that Matchcode will need.
- Details about the content
and format of your existing address file.
- Details about the content
and format of your proposed new address file.
- The matching parameters and
levels to use during processing.
This information is
supplied to Matchcode and stored within what we call a Job File. The Job File is created
using the Job File Wizard from within the Job Control software. Once the necessary
information has been entered, the Job File can be saved to disk. Because the Job File is
a single file, it can easily be copied and edited for reuse in further jobs or passes.
To start to create a Job File click New from the Job Control file menu.
I have several non-address fields in
my file. How does Matchcode handle these?
When you are specifying
your Input Fields, you will click the Add button. There are a number of fields shown, the
first in the list is User Field. Any fields that are not in the list, in other words
non-address fields or a person's name (if Electoral Roll data is not available), are
specified as User Fields. If these fields are also put into the output format, then
Matchcode will simply transfer this information across to the new file untouched.
If my input records are fixed
length, can I output CSV?
Yes you can. Within
Matchcode Job Control you can define the input and output file formats completely
independently of each other. So it is very simple to output CSV from a fixed length
record. In fact each field of both input and output can be defined independently.
I have a huge number of fields in my
file, including some large notes fields. Can I process this file?
Matchcode can handle a
large number of fields, but notes fields can be a problem. Many Windows applications that
support notes fields allocate several thousand bytes per notes field. Matchcode
cannot cope with fields this big. Ideally, in a situation such as this, we would recommend
exporting just the relevant fields plus some kind of unique identifier. This
makes setting up a job quicker and simpler. Once batch processing has finished you can
import the file and update the relevant fields using the unique identifier as your link.
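The export-and-relink approach can be sketched in a few lines of Python. The field names (ref, addr1, postcode, notes) and both helper functions are assumptions for illustration only; in practice the export and import would be done with your own database package.

```python
def export_address_fields(rows, id_field, address_fields):
    """Keep only the unique identifier and the address fields of each record."""
    keep = [id_field] + address_fields
    return [{name: row[name] for name in keep} for row in rows]


def merge_cleaned(rows, cleaned_rows, id_field):
    """Copy cleaned values back onto the full records, linked by the identifier."""
    cleaned_by_id = {row[id_field]: row for row in cleaned_rows}
    merged = []
    for row in rows:
        updated = dict(row)
        updated.update(cleaned_by_id.get(row[id_field], {}))
        merged.append(updated)
    return merged


full = [
    {"ref": "000001", "addr1": "1 the cresent", "postcode": "", "notes": "long free text"},
    {"ref": "000002", "addr1": "2 high st", "postcode": "", "notes": "more free text"},
]

# Cut-down records to send for cleaning: no notes field.
slim = export_address_fields(full, "ref", ["addr1", "postcode"])

# Stand-in for the cleaned output file; only record 000001 was matched here.
cleaned = [{"ref": "000001", "addr1": "1 THE CRESCENT", "postcode": "AB1 1AB"}]

result = merge_cleaned(full, cleaned, "ref")
print(result[0])
```

Records that were not matched keep their original values, and the notes field never leaves the source system.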
Are there any good general settings
to use when outputting new addresses?
PLEASE NOTE: These settings are provided as a guide only
and should not be used without careful testing.
| Consumer Address File (Names Level, assuming Electoral Roll is available) |
| Success Criteria | Postcode to Delivery Point Suffix level; Voter Name |
| Advanced Options | Accept postcode changes; Output the input address if matching is unsuccessful; Names Option = Address Improvement mode |
| Fuzzy Matching | Advanced fuzzy; Max Towns=3 |

| Consumer Address File (Address Level Only) |
| Success Criteria | Postcode to Delivery Point Suffix level |
| Advanced Options | Accept postcode changes; Output the input address if matching is unsuccessful |
| Fuzzy Matching | Advanced fuzzy; Max Towns=3 |

| Business Address File (Matching to Company Name level) |
| Success Criteria | Postcode to Delivery Point Suffix level and Address Key |
| Advanced Options | Accept postcode changes; Output the input address if matching is unsuccessful; Assign an address key only if top of address matches |
| Fuzzy Matching | Orthographic; Max Towns=3 |

| Business Address File (Where we are not outputting a new Company name) |
| Success Criteria | Postcode to Delivery Point Suffix level |
| Advanced Options | Accept postcode changes; Output the input address if matching is unsuccessful |
| Fuzzy Matching | Orthographic; Max Towns=3 |

| Any File (Postcode Level Only) |
| Success Criteria | Postcode |
| Advanced Options | Accept postcode changes |
| Fuzzy Matching | Orthographic; Max Towns=3 |
For more information about
Fuzzy Matching settings see General
Running Batches
Can anything be done to increase the
speed of a batch?
Yes, several factors impact
upon the performance of the Batch software.
- With regard to the machine
itself, Memory and disk I/O speed directly affect the speed. More memory and faster disk
I/O will improve performance significantly. This is because the process of searching PAF
involves heavy disk I/O and so the quicker this information can be retrieved and the more
that can be committed to cached memory the better. Also, and for the same reasons as
above, it is helpful to sort the input file on the postcode and address fields to organise
the records into address order as much as possible.
- Other factors include not
setting the location of the RCDB files and other additional data sets, if these are not
required. This is because Matchcode will retrieve information from these files whether
they are used in the output file or not.
- Poor input address quality
will cause Matchcode to spend longer searching for a match though of course this cannot be
avoided.
- Make sure that both data
files and the PAF database are held on the local machine rather than across a network if
possible.
You may also consider:
- Not checking for foreign
addresses.
- Not using input contact
names (If Electoral Roll data is available) if this is not providing useful benefits.
- If the file contains a lot
of unused, non-address fields, try creating a cut-down file of address related fields
only.
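Sorting the input file into postcode and address order, as suggested above, can be done before the batch with any tool that handles CSV. The following is a minimal Python sketch, assuming a small quoted CSV file that fits in memory; the column positions and function name are hypothetical.

```python
import csv
import io


def sort_by_postcode(text, postcode_index, address_index):
    """Return the CSV text with records ordered by postcode, then address line."""
    rows = list(csv.reader(io.StringIO(text)))
    rows.sort(key=lambda row: (row[postcode_index].replace(" ", "").upper(),
                               row[address_index].upper()))
    out = io.StringIO()
    # Quote every field so embedded commas cannot break the layout.
    csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
    return out.getvalue()


sample = (
    '"3","9 LONG ROAD","ZZ1 1ZZ"\n'
    '"1","2 THE CRESCENT","AB1 1AB"\n'
    '"2","1 THE CRESCENT","AB1 1AB"\n'
)

sorted_text = sort_by_postcode(sample, postcode_index=2, address_index=1)
print(sorted_text, end="")
```

This is a plain text sort, so it only groups neighbouring addresses together; it does not order building numbers numerically.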
I have several batches to run. Do I
need to wait until a batch has finished before starting another one?
No. Several batches can be
run simultaneously, though this will of course affect the speed of each running batch. The
number of batches that can be run at the same time is limited by the available memory.
My address file contains several
million records. Can Matchcode cope with this?
Yes it can. Matchcode deals
with each record sequentially so it doesn't matter how many records you have in the
file. However, you may want to consider splitting the file up into smaller batches, simply
because other external factors may cause a problem with a batch and so it may be sensible
not to put all of one's eggs in one basket. Having said this, a reasonably well-specified
machine, dedicated to running the batch, should be able to process more than a million
records overnight. Also bear in mind that the output file may be as large as,
if not larger than, the input file, so make sure that you have enough available disk space
to handle this output file.
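Splitting a file into smaller batches is straightforward to do before processing. Here is a minimal Python sketch; it works on a list of records held in memory for brevity, whereas a file of several million records would more sensibly be streamed line by line. The function name and batch size are illustrative assumptions.

```python
def split_into_batches(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Ten dummy records split into batches of four.
records = [f"{n:06d},ADDRESS {n}" for n in range(1, 11)]
batches = list(split_into_batches(records, 4))

for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {len(batch)} records")
```

Each batch would then be written to its own file and given its own Job File, with the record order preserved across batches.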
Typical Scenarios
I want to get as many matches from
my file as possible, and I don't mind spending a little extra time on it. What can I do?
A strategy of Multiple Passes is often the best way to go. Starting with the
entire file and very strict match settings we can put successful matches aside, and
continue with just the rejects. This process can be repeated, reducing the match level
with each pass until no more matches can be found.
My file contains a mixture of
consumer and business addresses. Can Matchcode process this file?
Yes. Matchcode can handle
files of mixed address content. It may be best to run multiple passes (see Multiple Passes).
In this way Matchcode can be tuned to get
optimum results for each type. Each pass can be set to pick up certain address types. This
method is particularly effective if Electoral Roll data is available. For example, Pass 1
could be set to use names, with the criteria set to insist on a person's name match.
In this way all the private addresses, which have matched names, can be removed from the
file as successful matches first. Secondly a pass to insist on Company name matches can be
used. Gradually we are left with the addresses that do not easily fit into these
categories.
I often clean the same files
regularly. Can I reuse an old Job File that was used for this file previously?
Yes you can. Simply make a
copy of it using Windows Explorer or some similar program. Rename the copy to something
meaningful and then open it. You will need to change the Input and Output filenames and
also re-count the records. Once you have done this you can save and run the new batch.
I want to update my addresses.
Should I replace them in my Output Format layout or append to my records?
You could simply create an
output format that mirrors your input format with Formatted Address Lines substituting
your Input Address Lines. And while this is a perfectly acceptable method, it imposes some
limitations. For example it is useful to include a Process Code
within the output records to indicate the records that have been matched. Also there are
situations where you may want to carry out some post-processing on your file, which would
be easier to achieve if new fields were appended. Another situation that requires access
to a more complete set of output format arises when we have sole-trader type addresses
within our file; see "Some successfully coded records have only partial output addresses,
or incorrect building numbers. Why?"
Usually we would recommend
appending information to your records and therefore giving yourself the opportunity to
examine and update the resulting file using a database package if required.
I have a file of very poor quality
addresses. Is there any way to flag the hopeless cases?
It is difficult to suggest
a reliable method of doing this because simply setting a very low match level, say area
code, does not guarantee that failed records are actually of no use. In fact the reverse
can sometimes be the case. Take for example a situation where we can match some of these
records satisfactorily to postcode level, and yet others cannot even be assigned area
codes. It may seem that the records that matched to postcode level stand a far better
chance of being good, and possibly suitable candidates for manual processing than those
that failed to reach area code. However, you may discover that the postcode level records
failed because, having been matched to street level quite easily, they were rejected
because the building number or name on the input simply does not exist, and could never be
found. On the other hand the records that failed to reach area code, may just need slight
correction to the town name for them to fall fully into place, their building names or
numbers being accurate but let down by poor area (town/locality) information. So you see,
it is not easy to make a reliable rule that can be applied to identify such hopeless
records.
That said, it might be
possible to run a pre-process to remove records that have very little information in the
address fields. In most cases though, creating and running such a process would take
longer to complete than it would take Matchcode to process and reject them.
My available address layout is only
4 fields of 30 bytes. Can Matchcode produce an address this small?
Yes but you need to use the
Advanced Address Formatting utility to do it
safely. This set of routines attempts to force an address into a smaller than recommended
space by using various techniques including Word Abbreviation, Field Concatenation, Field
Truncation and Field Elimination. How this is done can be carefully controlled using the
interface provided.
Can anything be done with the
residue records that the Batch system cannot find?
Yes, you could try cleaning
the records manually using the Interactive program. Open the Job File using the
Interactive program after your batch and select to view uncoded records when prompted. You
can then use the search mechanisms provided to search for addresses and save your successful
results to the output file. Typically around 20%-30% of the residue records
can be found using manual techniques.
I regularly process the same file,
some records manually; do I have to do these same records each time?
Probably not. You could
consider including Address Keys &
Organisation Keys in your output file format. This would make it possible to run an Address Generation batch on the file at some later stage.
In this way you wouldn't need to search for the addresses each time. Instead you
would do a very thorough job the first time, and once the records were found, either in
batch or interactive mode, they would have Address
Keys & Organisation Keys assigned which could be used to regenerate the required
address fields or postcodes from each new version of PAF.
I have a file of consumer addresses
to clean. I don't have time to run multiple passes. What should I do?
If you plan to create an
output file that is exactly the same format as your input file you should:
- Define your input layout
specifying all address lines etc. as usual.
- Create an output format in
which your original address lines are substituted with a Formatted
Address.
- Set the Success Criteria
code as Postcode, which should be to Delivery Point Suffix level.
- On the Advanced Options
dialog select Accept postcode changes, and Output the input address if matching is
unsuccessful.
- Highlight the output
postcode field and click Advanced to set the source as Matched Input
Floating.

Make any final adjustments
to the Job File and run the batch. Matched addresses will be replaced along with
postcodes. Unmatched records will have the original address carried across to the new
Formatted Address lines.
Troubleshooting
The Test Read button reported an
error. What can I do?
If Test
Read reported a problem, first check that the format that you have specified, agrees
with the actual file layout. At a basic level you could start this by clicking on the View
File button and simply checking the layout visually and counting fields. You could also
try the Analyse File button to see if the format is in fact
consistent but perhaps not as you have specified it. If this fails you may need to go to
the data file itself and check the format using some other method. If your file is of the
CSV type, perhaps a variable length Comma delimited structure, make sure that your fields
have double-quote field qualifiers. See "I
have had problems processing my CSV file, are there any known pitfalls with this type of
file?"
Some successfully coded records have
only partial output addresses, or incorrect building numbers. Why?
If you are outputting new
replacement addresses, you need to make sure that you have set the Batch Success level
correctly. By default Matchcode sets the Batch Success to postcode level in a new Job
File. Most postcodes, and in particular consumer address postcodes, are shared by
several addresses, so matching to this level does not guarantee a full address match. You
need to set the Batch Success level to Postcode in the top half of the dialog box, and
change it from Postcode in the bottom half of the dialog to Delivery Point Suffix. A
Delivery Point Suffix or DPS relates to an individual address or delivery point, so to
achieve a DPS level match, Matchcode must find the whole address. Once this has been
changed, re-run the batch and you should see that the bad matches are now marked as
rejects.
I seem to be getting a very low
match rate on what I think is a reasonably good file. What can I do?
Once you have confirmed
that your layout is correct and that Matchcode is using all of your address lines,
including any company names and postcodes, you should check which Advanced settings you are
using. One that you need to check in particular is Allow Postcode
Changes. By default this tick box is not checked. You should change it by putting a
tick in the box. Without this box checked, Matchcode will reject all records where it
cannot agree with any original postcodes that you may have. By making this change and
re-running the batch you are allowing Matchcode to supply postcodes even if they are
different from your original ones.
I have examined the output file from
a business file and some of the company names are missing. Where are they?
Providing that you are sure
that your match levels are set correctly then it is probably safe to say that the records
in question are sole-trader type addresses. What happens is that Matchcode safely matches
the address you have provided, but on PAF the address is a private residential property,
perhaps operating as a business. Consequently there is no business name on PAF so
Matchcode can only return the private address. The solution to this problem is to separate
the output company name from the address so that this situation can easily be checked for,
after the batch. To do this you need to create an output format that starts with the PAF
Address element Organisation and is followed by Formatted Address lines, which exclude the
Company name itself.
In this way, rather than
producing for example a 5 line Formatted Address, you produce a 4 line Formatted Address
plus a company name in a field of its own. Later you can check to see if the company
name field is populated, and if not, populate it using the original company name from your
input address. This can be achieved quite easily using a database package.
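The post-batch check described above amounts to a simple rule per record. Here is a minimal Python sketch of that rule, assuming each output record is available as a dictionary; the field name organisation and the sample values are hypothetical.

```python
def backfill_company(output_row, input_company):
    """Use the original company name when the PAF organisation field came back empty."""
    row = dict(output_row)
    if not row.get("organisation", "").strip():
        row["organisation"] = input_company
    return row


# A sole-trader style result: the address matched but PAF holds no organisation name.
matched = {"organisation": "", "addr1": "1 THE CRESCENT", "postcode": "AB1 1AB"}
fixed = backfill_company(matched, "JOE SMITH PLUMBING")

# A normal business result: the PAF organisation name is kept.
business = {"organisation": "ACME LTD", "addr1": "2 HIGH STREET", "postcode": "AB1 1AC"}
kept = backfill_company(business, "ACME LIMITED")

print(fixed["organisation"], "|", kept["organisation"])
```

The same rule could be written as a single update query in most database packages.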
To exclude the company name
from a Formatted Address, simply click on any one of the Formatted Address fields in the
Output Format tab and then click on the Advanced button to the right. In this dialog you
have various options, one of which is to remove the organisation name from the address.
My building number ranges used to
have Hyphens between the numbers, now they don't. What can I do?
Punctuation characters such
as these are removed from PAF and so number ranges such as 5-7 THE HIGH STREET become 5 7
THE HIGH STREET. However, this problem can be overcome by using Advanced Address Formatting.
My batch has finished with a message
indicating that it didn't find as many records as I specified. What has happened?
Sometimes a batch may
finish before it reaches the number of records specified in the Job File, even though
count records was used in the first place. This usually means that you are dealing with a
CSV type file that has fewer fields than you specified in your input format. To check this
try Analyse File and Test Read to see
if they confirm your specified layout.
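If you want a quick independent check of whether every record has the same number of fields, a short script can profile the file. This is a minimal Python sketch and only a rough stand-in for the kind of consistency check described here; it is not how Analyse File itself works, and the sample data is invented.

```python
import csv
import io
from collections import Counter


def field_count_profile(text, delimiter=","):
    """Count how many records have each number of fields."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return Counter(len(row) for row in reader)


sample = (
    '"000001","MR JOE SMITH","AB1 1AB"\n'
    '"000002","MRS ANN JONES","AB1 1AC"\n'
    '"000003","AB1 1AD"\n'
)

profile = field_count_profile(sample)
for fields, records in sorted(profile.items()):
    print(f"{records} record(s) with {fields} field(s)")
```

More than one entry in the profile means the layout is inconsistent and the input format needs another look.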
My output records are fixed length.
I have specified them correctly but Matchcode insists that there is a problem. Why?
Some systems, when producing
fixed length records, do not pad out the last field to a consistent length, effectively
making the file variable length. Try specifying your last input field as being variable
length rather than fixed. This may cure the problem.
The Town field in my formatted
address is in upper case. Can I fix this?
Yes, simply go to the
output format tab of your Job File. Select any one of the Formatted Address lines and
click on the Advanced button. Click on Format the town in upper and lower case.
My file of business addresses is
coding very poorly, what can I do?
Make sure that you have
specified the company name as Address Line 1. Matchcode needs to see all available address
information to get the best and safest results.
I have a file of mixed addresses,
business and consumer, but only the business records are matching. Why?
Make sure that you
haven't set Organisation Key as one of your match criteria. Doing so would force
Matchcode to flag as successful only those addresses where it could assign a positive
Organisation Key. This would mean that any Large User or Residential type addresses would
be rejected, as neither of these types ever has an Organisation Key.
Alternatively, if you have
the commercial data set BUSINESS.PAF make sure that it isn't the only PAF being used.
I want to examine my uncoded records
using Interactive. But Interactive complains about the output file that batch created!
Batch processing can mess
up an output file format if you started with an input file that had Double-Quote field
qualifiers which you forgot to put back into the output file. In other words a perfectly
well formatted input file can become a badly formatted output. This happens because
Matchcode removes the double quotes as it handles each record. Consequently you need to
make sure that all output fields are Double-Quoted. This can be done from the Output
Format tab of Job Control.
Matchcode Fields
(Not all fields are covered
in this section)
Input Fields
User Field
User Field is one of the
available input fields. It should be used to refer to any input field or fields that
Matchcode does not use. If the input field cannot be found amongst the other available
fields, such as Input Address Lines then it should be referred to as a User Field.
So typically things such as
Unique Reference Numbers, Telephone numbers and other contact specific details would be
referred to in this way.
Input Addr Key and Input Orgn Key
These are used if the batch
is of the Address Generation type. See Address
Keys & Organisation Keys.
Address Line (1-7)
These fields are used to
specify any of the Address or Company name fields. Postcodes can either be specified as
address lines, or as Input Postcode if they are in a field of their own.
Input Postcode
If the input file has a
specific postcode field, then this field should be used to represent it.
Elector Fields
If Electoral Roll data is
available and the input file has people's names on consumer type addresses then these
fields can be used to specify the name field(s). If the entire name is held within a
single field then Elector Name should be used. The Elector Name field can also be used to
specify surname.
Output Field Groups
Input Fields
These are the fields as
specified in the input file.
PAF Codes
Postcode and Postcode Type
The PAF codes are, as the
name suggests, taken directly from PAF. They include postcode. There is also a field
called Postcode Type. This field indicates the address type (L=Large User: S=Small User:
=Unclassified).
Outcode / Incode / DPS Code
These are the 3 parts of
the postcode split out.
AB10 1AJ 1A
AB10 = Outcode
1AJ = Incode
1A = DPS Code
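The split shown above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the three parts are separated by spaces, as in the example; the function name is hypothetical.

```python
def split_postcode(code):
    """Split 'OUTCODE INCODE [DPS]' into its parts; DPS is empty when absent."""
    parts = code.split()
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected postcode layout: {code!r}")
    outcode, incode = parts[0], parts[1]
    dps = parts[2] if len(parts) == 3 else ""
    return outcode, incode, dps


print(split_postcode("AB10 1AJ 1A"))
print(split_postcode("EC3N 4DU"))
```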
Address Keys & Organisation Keys
These two 8 character codes
can be assigned to addresses that have been matched to Delivery Point level. They are
generally used together. All properties that receive mail should have their own Address
Key. Unlike the postcode, which can change from time to time, the Address Key should remain
static. In some cases, several delivery points share a single Address Key. This can happen
when several businesses are operating from a single building. To see an example of this,
in Matchcode Data Capture select Search|Address Key Lookup..., enter the following Address Key, and press Find:
Address Key = 29402275
It will return a small list
of companies each operating from
S G HOUSE
41 TOWER HILL
LONDON
EC3N 4DU
Each of these organisations
shares the above Address Key, but they each have their own Organisation Key and their own
DPS. If you then remove the postcode from the address elements and browse (F5) on the address itself, you
will notice that another organisation is added to the list. This organisation, as you will notice, has its
own postcode and Address Key. If you look at the Address Key you will see that it starts
with the number 6. This indicates that this is a Large User.
Because of their constant
nature, Address Key and Organisation Keys are useful as part of a dedupe.
Also, if a large
file of addresses needs to be regularly updated, and an element of manual processing may
be part of this process, then it may be useful to attach these keys to the output file. In
this way, next time the file is processed, an Address
Generation job could be used to generate new postcodes etc. for the file without
requiring manual processing again.
By examining the Address
Key and Organisation Key it is possible to get some idea of the address type.
IF THE ADDRESS KEY STARTS WITH THE NUMBER 6 OR 7
    THEN THE ADDRESS IS A LARGE USER
ELSE
    IF THE ORGANISATION KEY > ZERO
        THEN THE ADDRESS IS A SMALL USER ORGANISATION
    ELSE
        THE ADDRESS IS UNREGISTERED AND MAY BE A RESIDENTIAL ADDRESS
    ENDIF
ENDIF
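The rule above translates directly into code for use in post-processing. Here is a minimal Python sketch, assuming both keys arrive as 8 character strings of digits; the function name and the sample Organisation Key values are invented for illustration.

```python
def classify_address(address_key, organisation_key):
    """Apply the rule above to one record's Address Key and Organisation Key."""
    if address_key[:1] in ("6", "7"):
        return "large user"
    # Assumption: an empty or all-zero Organisation Key means none was assigned.
    if int(organisation_key.strip() or "0") > 0:
        return "small user organisation"
    return "unregistered, may be residential"


print(classify_address("61234567", "00000000"))
print(classify_address("29402275", "00012345"))
print(classify_address("29402275", "00000000"))
```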
PAF Address
These are the raw PAF
elements and can be used to create a structured address where no formatting or field
concatenation takes place.
Formatted Address
This group contains the
formatted version of the PAF address where elements such as Building Number and Street are
concatenated together into a single field. Up to 7 fields can be used and either basic or Advanced Address Formatting can take place to
structure the address.
Special Fields
This group contains various
fields such as date and record sequence number, but it also contains the flags that
Matchcode can output to indicate matching success level etc. See: Process
Code, Non-PAF Status and AddrFrmt Status.
RCDB Fields
This group contains various
other address related codes such as O/S Grid Reference Codes.
Elector Fields
This group contains
Electoral Roll name elements etc.
Matchcode Flags
Process Code
The Process Code can be
found by going to the Output tab of an open Job File, clicking on the Add button and then
selecting it from the Special Fields group. It is a single character which will indicate
one of four states for each record. These states are:
| Automatic | The record has been automatically matched to the requested level. |
| Manual | The record has been saved manually using the Interactive software. |
| Uncoded | The record has not been automatically matched to the requested level. |
| Foreign | The record is a suspected foreign address. |
Although by default these
codes are represented as A/M/U/F they can be changed by clicking on the Advanced button of
the output tab when Process Code is selected.
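Once the Process Code is in the output file, the results of a batch can be summarised with a simple tally. This is a minimal Python sketch, assuming a quoted CSV output file with the default A/M/U/F codes; the column position and sample records are hypothetical.

```python
import csv
import io
from collections import Counter


def tally_process_codes(text, code_index):
    """Count the records carrying each Process Code."""
    return Counter(row[code_index] for row in csv.reader(io.StringIO(text)))


sample = (
    '"000001","AB1 1AB","A"\n'
    '"000002","","U"\n'
    '"000003","AB1 1AC","M"\n'
    '"000004","AB1 1AD","A"\n'
)

tally = tally_process_codes(sample, code_index=2)
for code, count in sorted(tally.items()):
    print(code, count)
```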
Non-PAF Status
This flag should be used
whenever Non-PAF elements are retained. It is a single character and contains a number. It
can be interpreted as follows:
| 0 | No Non-PAF elements were found |
| 1 | Non-PAF element found and successfully retained |
| 2 | Non-PAF element found but could not be successfully retained |
| 3 | A combination of 1 & 2 |
If we combine its use with
the option not to flag PNR localities, we can then use this flag to help us to examine the
output data in a more focused way, concentrating on truly unrecognised elements that were
kept. These will be found in records flagged with either 1 or 3.
AddrFrmt Status
This flag should be used
when Advanced Address Formatting is used.
It will contain a number which is the sum of the numbers that represent the
methods used within the address. The basic numbers are as follows:
| 0 | No formatting required |
| 1 | Abbreviation took place |
| 2 | Concatenation took place |
| 4 | Truncation took place |
| 8 | Field elimination took place |
Example (1)
Abbreviation + Truncation + Field Elimination = 1 + 4 + 8 = 13
Example (2)
Abbreviation + Concatenation = 1 + 2 = 3
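Because each method has its own power of two, a combined status value can be broken back down into the methods that were applied. Here is a minimal Python sketch of that decoding; the function name is hypothetical and the values come from the table above.

```python
# Basic AddrFrmt Status numbers, as listed in the table above.
METHODS = {
    1: "Abbreviation",
    2: "Concatenation",
    4: "Truncation",
    8: "Field elimination",
}


def decode_addrfrmt_status(status):
    """List the formatting methods that make up a combined status value."""
    return [name for value, name in METHODS.items() if status & value]


print(decode_addrfrmt_status(13))
print(decode_addrfrmt_status(3))
print(decode_addrfrmt_status(0))
```

A status of 0 decodes to an empty list, meaning no formatting was required.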
Glossary
Advanced Address Formatting
Matchcode has advanced
address-formatting features, which can be used to format addresses in a more controlled way.

Some of the things that can
be done with the advanced formatting are as follows:
- Hyphens or Slashes can be
put into number ranges.
- Words can be abbreviated to
shorten field lengths.
- The word case of each
address element can be controlled.
- The field position of each
address element can be controlled.
- Field concatenation can be
controlled.
- Fields can be eliminated in
any order of priority to save space.
These and other features
within the formatting can be used to write addresses into very limited address lines, or
when a more ordered Formatted Address is required.
To use Advanced Formatting
- Make a local copy of
ADDRFRMT.INI. This can be found in the installation directory.
- Open the required Job File
and go to the output format tab.
- Click on any one of the
Formatted Address lines.
- Click on the advanced
button.
- In the dialog box click on
Use a configuration file for formatting options
- Click on the Configuration
File tab.
- Run the Address
Formatting Configuration tool from either the Run menu or from the Matchcode group.
- Browse for and select your
ADDRFRMT.INI file in the browser provided.
- Select the number of
Formatted Address lines to use.
- Click Change Settings.
Analyse File
The Analyse button on the
input format tab of an open Job File requests that Matchcode read the file and produce a
simple report showing such information as field and record delimiters, maximum record
length etc. It will also show if the file does not have a consistent format.
Test Read
The Test Read button causes
Matchcode to use the specified input format layout to read the file as though it were
running the batch. This is a way of simulating the read part of the batch to provide an
early warning of any potential format errors.
Success Criteria
Perhaps the most important
subject to understand when processing address files is the Success Criteria. This is
because failure to set this correctly may lead to unreliable results.
Address Generation
The term Address Generation
refers to using the Address Key and Organisation Key to retrieve a single address from PAF
without the need for Cross Matching of the address itself. This can be done manually in
Interactive mode or as a batch by specifying the position of Input Addr Key and Input Orgn Key in the
input file and setting the type of batch to Address Generation on the Batch Processing tab of
Job Control.
Electoral Roll Names
If the Electoral Roll
(Names) files are available then they can assist the Cross Matching process. Depending on
set up, Matchcode uses any available names in the input address to disambiguate otherwise
ambiguous addresses.
An example of where this
may be useful is where an input record uses a building name that is unknown to the Royal
Mail. Let's say the input address is something like this:
MR J SMITHERS
DUNROAMIN COTTAGE
LITTLE AVENUE
SMALLTOWN
SOMEPLACE
Now let's assume that LITTLE
AVENUE has ten properties on it. These are numbered, predictably, 1 to 10. There are no
registered building names. Without the benefit of Electoral Roll names it would be
impossible to do more than assign a postcode. The address itself would remain unmatched.
However, if Matchcode is using Names it will examine a list of all the people in LITTLE
AVENUE and, if it finds a good match to MR J SMITHERS, it will return the address where he
lives, let's say number 7.
If we combined this method
with Add Non-PAF elements to the
output address, and outputting names, we could create an output record like this:
MR JASON SMITHERS
DUNROAMIN COTTAGE
7 LITTLE AVENUE
SMALLTOWN
SOMEPLACE
Note that we were able to
provide the full name, retain the unmatched building name and add the missing building
number.
Add Non-PAF elements to the output address
This option is enabled from
the Advanced Cross Matching Options of the Batch Processing tab of an open Job
File.
This option instructs
Matchcode to keep, where possible, unmatched input address elements. Any information that
Matchcode did not match during the Cross Matching is put back into the output address if
space is available. This may include building names etc. It is advisable to run tests
before committing oneself to retaining these unmatched elements to avoid unwanted results.
FOR
- Preferred address details
such as house names and localities can be put back into a clean version of the address.
- Extra information such as
department names may be retained though not actually on PAF.
AGAINST
- Unmatched parts of the input
address, which may not be desirable to keep, may be retained, e.g. ** UNPAID ACCOUNT **.
- Matchcode may retain what it
considers to be missing address elements when they are in fact just badly spelled versions
of matched elements, so elements could be repeated, for example SURBITON GROVE, ZERBITTEN GROWVE.
We have some control over
how Matchcode deals with PNR (Postally Not Required) Localities. These are locality names
that are commonly known and used but are not recognised by the Royal Mail, and therefore
do not appear on PAF. They are known to Matchcode and may be used to assist matching, but by
default play no part in the output. However, we can specify whether these elements are retained
and also whether Matchcode flags them as Non-PAF elements. If we choose to use the Non-PAF option we should
include a flag to show what, if anything, has taken place. This flag is known as Non-PAF Status.
Multiple Passes
A strategy of multiple
passes is a good way to get the best results from Matchcode. Effectively we use Matchcode
to filter out records with each pass through the file. We should start by setting the
highest possible levels of match and gradually remove successful records, creating
ever-smaller residue files, which are processed using reduced levels of matching. Each
level should be indicated in the output file by a change of Process
Code so that, when all possible passes have been completed and the resulting files
merged back into a single file, we can easily identify the match level of each record.
What to do after the first batch has finished
Once a pass has been
completed, the files can be split ready to go through the next pass. This is achieved by
selecting Extract Records from the File menu of Job Control. This will bring up a dialog
box. In here we can first select Automatically Coded records from the Output file, and
write these to a new file. These successful records can be set aside until we are ready to
merge the output data. Next, we repeat the process, selecting all other categories
(Uncoded / Foreign / Manual) from the input file. This file will become the input file for
a new, smaller batch. All we need to do then is to take a copy of the Job File using
Windows Explorer or some other software. Then we edit the parameters of this new job file
to:
- Change the input and output
filename to refer to the new residue file.
- Re-count the records.
- Make changes to the match
level and parameters to apply to the new batch.
- Change the Process Code to indicate the new match level.
For example, if we have
access to Electoral Roll Names data, we may carry out
the following levels of matching:
| Pass-1 | Name & Address matches only. Process Code of N for successful records. |
| Pass-2 | Address matches only. Process Code of A for successful records. |
| Pass-3 | Postcode matches only. Process Code of P for successful records. |
So our output file will
contain the following Process Code flags: N / A / P / U.
Obviously, by the time we
are looking for postcodes only, we would have to retain our original addresses rather than
replacing them.
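The flow of records through the passes can be modelled in a few lines. This is only a sketch of the strategy: in practice each pass is a separate Matchcode batch run followed by Extract Records. The function name, the matcher callables and the toy records below are all hypothetical stand-ins.

```python
def run_passes(records, passes, uncoded_code="U"):
    """Apply each pass to the residue of the previous one.

    `passes` is a list of (process_code, matcher) pairs ordered from the
    strictest match level to the loosest. Each matcher is a stand-in for a
    Matchcode batch run and returns True when a record is matched.
    """
    coded = []
    residue = list(records)
    for process_code, matcher in passes:
        still_unmatched = []
        for record in residue:
            if matcher(record):
                coded.append((process_code, record))
            else:
                still_unmatched.append(record)
        residue = still_unmatched  # smaller input for the next pass
    # Whatever is left after the final pass stays uncoded.
    coded.extend((uncoded_code, record) for record in residue)
    return coded

# Toy records: the boolean fields pretend to be the outcome of each match level.
records = [
    {"id": 1, "name": True, "address": True, "postcode": True},
    {"id": 2, "name": False, "address": True, "postcode": True},
    {"id": 3, "name": False, "address": False, "postcode": True},
    {"id": 4, "name": False, "address": False, "postcode": False},
]
passes = [
    ("N", lambda r: r["name"] and r["address"]),
    ("A", lambda r: r["address"]),
    ("P", lambda r: r["postcode"]),
]
for code, record in run_passes(records, passes):
    print(code, record["id"])
```

Each record receives the code of the first pass that matches it, which is why the strictest level must run first.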
Fuzzy Matching dialog box
When you open a saved Job
File and go to the Batch Processing tab you will see the Fuzzy Matching button. By
clicking on this button you will bring up a dialog box in which you can control some of
the ways Matchcode searches for your addresses.
The dialog box is split
into 3 sections. These are General, Max Towns and Number Matching.

General
Orthographic is the basic
kind of fuzzy matching which Matchcode uses. It allows for such errors as character
transposition and missing or duplicated characters. In most cases this is probably the
safest option to use when processing business addresses.
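How Matchcode implements Orthographic matching internally is not described here. As a general illustration of the kinds of error listed (transposed, missing or duplicated characters), the sketch below uses a well-known technique, the optimal string alignment form of Damerau-Levenshtein edit distance, in which each such error counts as one edit. The function name is hypothetical.

```python
def osa_distance(a, b):
    """Edit distance counting insertions, deletions, substitutions and
    transpositions of adjacent characters as one edit each."""
    prev2 = None                      # row i-2, needed for transpositions
    prev = list(range(len(b) + 1))    # row i-1
    for i, ca in enumerate(a, 1):
        cur = [i] + [0] * len(b)
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur[j] = min(prev[j] + 1,         # character missing from b
                         cur[j - 1] + 1,      # extra character in b
                         prev[j - 1] + cost)  # same or substituted character
            if i > 1 and j > 1 and ca == b[j - 2] and a[i - 2] == cb:
                cur[j] = min(cur[j], prev2[j - 2] + 1)  # adjacent transposition
        prev2, prev = prev, cur
    return prev[len(b)]

print(osa_distance("LITTLE AVENEU", "LITTLE AVENUE"))  # transposed characters
print(osa_distance("SURBTON", "SURBITON"))             # missing character
print(osa_distance("SURBITTON", "SURBITON"))           # duplicated character
```

A matcher built on a measure like this would accept a candidate only when the distance is small relative to the length of the name.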
Advanced Fuzzy Matching
does everything that Orthographic does but in addition it allows for missing or extra
words, incorrect word order etc. It is a more heavyweight form of fuzzy matching and as
such it will probably increase the number of successful matches. It can also increase the
number of mismatches as more differences between input and output are allowed.
Check For Foreign causes
Matchcode to search an internal list of known foreign towns if a record fails to match. If
the address is believed to be foreign then it is flagged accordingly.
Employ User Codes: see the
Manual for details.
Max Towns
Setting this option to 2
or more causes Matchcode to continue searching when an initial town match fails to yield a
successful match. This is particularly useful in situations where an address could contain
more than one viable town name. This can occur when Localities have been promoted into
Towns, or when records contain such things as Near Guildford.
Number Matching
By deselecting these
options you can limit the amount of tolerance Matchcode applies when matching building
numbers.