How Property Records Are Merged

For each record we collect, we generate one or more keys. Each key is built from a different set of unique identifiers available in the record's data. If we later see a different record that shares one or more of the same key values, we merge the two records.
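The merge rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not our actual implementation; the function name shares_key is hypothetical.

```python
def shares_key(keys_a, keys_b):
    """Return True when two records have at least one key value in common."""
    # Hypothetical helper: two records are merge candidates as soon as
    # their key lists overlap on a single value.
    return not set(keys_a).isdisjoint(keys_b)


first = ["US/TX/Austin/123AnywhereSt"]
second = ["US/TX/Austin/123AnywhereSt", "some-other-key"]  # second key is made up
print(shares_key(first, second))
```

Only one matching key is needed, so a record with several keys has several chances to match an earlier record.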

For example, we may generate a property record like this when crawling a web page:

{
  "address": "123 Anywhere St",
  "city": "Austin",
  "province": "TX",
  "country": "US",
  "numBedroom": 3,
  "numBathroom": 3
}

This record will generate the following keys:

"keys": [
  "US/TX/Austin/123AnywhereSt"
]
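
As a rough sketch of how a key like this could be derived, the Python below joins country, province, city, and address with slashes and removes whitespace inside each part. That normalization is an assumption inferred from the single example above; the real rules may differ (for instance in casing or punctuation), and the function name address_key is hypothetical.

```python
import re


def address_key(record):
    """Build a key such as 'US/TX/Austin/123AnywhereSt', or None if a field is missing."""
    parts = [record.get(f) for f in ("country", "province", "city", "address")]
    if not all(parts):
        return None  # assumption: no key is generated without all four fields
    # Assumed normalization: drop whitespace so "123 Anywhere St" becomes "123AnywhereSt".
    return "/".join(re.sub(r"\s+", "", str(p)) for p in parts)


record = {
    "address": "123 Anywhere St",
    "city": "Austin",
    "province": "TX",
    "country": "US",
    "numBedroom": 3,
    "numBathroom": 3,
}
print(address_key(record))  # US/TX/Austin/123AnywhereSt
```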

Let's say we then crawl another web page for the same property and generate this data:

{
  "address": "123 Anywhere St",
  "city": "Austin",
  "province": "TX",
  "country": "US",
  "neighborhoods": [
    "Rolling Hills"
  ]
}

This record will generate the same key value as the previous record, so the two records will be merged. The resulting record will be:

{
  "address": "123 Anywhere St",
  "city": "Austin",
  "province": "TX",
  "country": "US",
  "neighborhoods": [
    "Rolling Hills"
  ],
  "numBathroom": 3,
  "numBedroom": 3
}
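
The merge in this example can be sketched as follows. The sketch makes assumptions this page does not state: fields that appear in only one record are carried over, list fields are combined without duplicates, and when both records have a non-list value for the same field the earlier value is kept. The function name merge_records is hypothetical.

```python
def merge_records(existing, incoming):
    """Combine two records that share a key into a single record."""
    merged = dict(existing)
    for field, value in incoming.items():
        if field not in merged:
            # Field only seen in the newer record: carry it over.
            merged[field] = list(value) if isinstance(value, list) else value
        elif isinstance(merged[field], list) and isinstance(value, list):
            # Assumed behavior: combine lists, skipping values already present.
            merged[field] = merged[field] + [v for v in value if v not in merged[field]]
        # Otherwise keep the existing value (assumed conflict policy).
    return merged


first = {
    "address": "123 Anywhere St", "city": "Austin", "province": "TX",
    "country": "US", "numBedroom": 3, "numBathroom": 3,
}
second = {
    "address": "123 Anywhere St", "city": "Austin", "province": "TX",
    "country": "US", "neighborhoods": ["Rolling Hills"],
}
print(merge_records(first, second))
```

With the two records above, the result contains the bedroom and bathroom counts from the first page and the neighborhoods list from the second.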

Property records use the following fields to generate keys:

  • address
  • city
  • country
  • province
  • taxID
  • mlsNumber

taxID and mlsNumber are not used on their own; each is combined with province to generate a key.
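
To illustrate how a record can end up with more than one key, the sketch below pairs taxID and mlsNumber with province. The key layout (province/fieldName/value), the function name identifier_keys, and the sample identifier values are all hypothetical; this page only documents the address-based key format.

```python
def identifier_keys(record):
    """Return keys built from taxID and mlsNumber, each paired with province."""
    keys = []
    province = record.get("province")
    if not province:
        return keys  # without a province, neither identifier yields a key
    for field in ("taxID", "mlsNumber"):
        value = record.get(field)
        if value:
            # Hypothetical layout; the real key format is not documented here.
            keys.append(f"{province}/{field}/{value}")
    return keys


# Sample values are invented for illustration.
record = {"province": "TX", "taxID": "0123456", "mlsNumber": "A98765"}
print(identifier_keys(record))
```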