Q: Is it true that if my Puerto Rican address records have the Urbanization, I should have little trouble standardizing and geocoding them?

A: No. This is one of the most widely held myths about Puerto Rico. The Urbanization is an important part of many Puerto Rican addresses. However, it is only one of numerous differences between Puerto Rican address records and those of the 50 States.

Q: Puerto Rican addresses within my data are truncated or abbreviated. Can SeekData still standardize and geocode these addresses?

A: Yes. Truncation is a very difficult problem. Puerto Rican address parts are entered in no particular order, and information forms often provide inadequate space, so addresses get cut off. It cannot be assumed that most truncations will fall on the urbanization; in fact, the urbanization is often the first part of the address. Even minor truncation can leave an address uncodeable. SeekData has designed data sets, complex algorithms and queries to resolve this problem.

Q: Can SeekData standardize and geocode addresses that contain a street or urbanization name that may be written in many different ways?

A: Yes. SeekData has developed an Alternative Name Table called ANTA which cross-references thousands of names to solve this problem.
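
As a rough illustration of the idea only, an alternative-name lookup can be sketched as a mapping from variant spellings to one canonical name. The structure of ANTA itself is not described here, so the table contents, the function names and the normalization rule below are all hypothetical; the sample variants reuse the PARQ LUNA example that appears later in this FAQ.

```python
from typing import Optional

# Hypothetical sample entries; the real ANTA contents are not shown in this FAQ.
ALTERNATIVE_NAMES = {
    "PARQ LUNA": "PARQUE DE LA LUNA",
    "PARQUE LUNA": "PARQUE DE LA LUNA",
    "PARQUE DE LA LUNA": "PARQUE DE LA LUNA",
}


def normalize_key(name: str) -> str:
    """Uppercase and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(name.upper().split())


def canonical_name(name: str) -> Optional[str]:
    """Return the canonical name for a known variant, or None if it is not in the table."""
    return ALTERNATIVE_NAMES.get(normalize_key(name))
```

A call such as `canonical_name("parq  luna")` would resolve to the full name, while an unknown name returns `None` so that it can be routed to other matching steps.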

Q: Letters and numbers appear in the same parsed section of a Puerto Rican address. Can this type of address be geocoded?

A: Yes. Many accepted abbreviations are used in Puerto Rican addresses such as "C" for Calle (street) or Carretera (state or county highway). This would not normally be an issue if not for the common usage of alphanumeric house numbering. One example is "C127", which may very well indicate either the house number itself or Carretera 127, which is quite different. Extensive research and specially designed data sets, algorithms and queries make it possible for SeekData to geocode those types of records.

Q: Does SeekData use GPS?

A: Yes. GPS not only gives SeekData sub-meter ground truth, but can also be used to verify data accuracy. In the field, we equip our teams with state-of-the-art Trimble® products running Microsoft Windows CE® and TerraSync™ software. That data is then fed into our unique Puerto Rican geocoder.

Q: My Puerto Rican address data is in Spanish and contains hyphens and other punctuation embedded within the address. Should I convert it to English or clean up the punctuation prior to submitting it to SeekData?

A: No. Puerto Rican addresses that are entered in one character set (Spanish) and then converted to a non-Spanish character set or plain ASCII are much more challenging. At times the conversion deletes special Spanish characters, introduces unnatural spaces or changes certain characters into completely different ones. Files that originally contained punctuation and were subsequently cleansed of it can become much more difficult to parse correctly. Punctuation can be a very important part of a Puerto Rican address. It is always preferable to receive the data "as is" than to have it converted or cleansed in any way.
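
The kind of damage described above is easy to reproduce. The short sketch below uses Python's built-in `str.encode` with its standard `ignore` and `replace` error handlers, plus a simple punctuation strip. The first sample address is made up for illustration; the second is taken from an example later in this FAQ.

```python
import re

# "\u00d1" is the single character Ñ; the address itself is a made-up sample.
original = "URB. VILLAS DE CA\u00d1AS"

# Converting to ASCII either drops the character or swaps in a placeholder.
dropped = original.encode("ascii", errors="ignore").decode("ascii")
replaced = original.encode("ascii", errors="replace").decode("ascii")

# Stripping punctuation fuses the two numbers in "12-14" into one.
stripped = re.sub(r"[^\w\s]", "", "12-14 CALLE 34 SE")

print(dropped)   # URB. VILLAS DE CAAS
print(replaced)  # URB. VILLAS DE CA?AS
print(stripped)  # 1214 CALLE 34 SE
```

In each case the information that was lost cannot be recovered from the converted record alone.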

Q: Can SeekData geocode rural Puerto Rican addresses?

A: Yes. Puerto Rican mail is delivered by the USPS. As in the 50 States, some rural areas do not receive door delivery. Unlike residents of the 50 States, however, Puerto Ricans in rural areas will almost always give their city-style street address rather than a Rural Route, Highway Contract or Post Office Box address. SeekData maintains and utilizes data sets that cover the majority of these rural areas.

Q: Can SeekData geocode my Puerto Rican address data when some addresses contain KM and HM markers?

A: Yes. Typically about 20% of the records in a Puerto Rican address file contain carreteras with kilometer (KM) and hectometer (HM) markers. Many rural addresses are defined by a carretera designation accompanied by KM and HM markers. SeekData has produced a specialized data set that cross-references all carreteras with their KMs and HMs. KMs are posted in both directions of travel, so other designations are used to determine which direction applies and therefore which geographic reference is assigned.
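
For illustration, a marker pair can be reduced to a single distance along the carretera, since one hectometer is 100 meters. The input layout `CARR <n> KM <k> HM <h>`, the regular expression and the function name are assumptions made for this sketch. It does not resolve direction of travel, and it is not SeekData's cross-reference.

```python
import re
from typing import Optional, Tuple

# Assumed layout: "CARR 127 KM 3 HM 4"; real records vary far more than this.
MARKER_RE = re.compile(
    r"\b(?:CARR|CARRETERA)\s+(?P<road>\d+)\s+KM\s+(?P<km>\d+)"
    r"(?:\s+HM\s+(?P<hm>\d))?\b"
)


def parse_km_marker(address: str) -> Optional[Tuple[str, int]]:
    """Return (road number, distance in meters), or None if no marker is found."""
    match = MARKER_RE.search(address.upper())
    if match is None:
        return None
    km = int(match.group("km"))
    hm = int(match.group("hm") or 0)  # HM is optional in this sketch
    return match.group("road"), km * 1000 + hm * 100
```

Under these assumptions, `parse_km_marker("Carr 127 km 3 hm 4")` yields road `"127"` at 3,400 meters. Keeping the distance in whole meters avoids floating-point rounding when markers are compared.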

Q: Some of my Puerto Rican address records have only building names as their addresses. Can these records be standardized and geocoded?

A: Yes. Many Puerto Ricans give only the building or complex name when asked for their address, whether it is a condominio (condo), apartamento (apartment), residencia (residence), cooperativa (co-op) or proyecto (project). The same happens in the 50 States, where, for example, 8903 Presidential Parkway is also known as Washington Plaza 1. Some of these addresses are referenced in the USPS file, but most are not. SeekData has compiled a data set of all known condos, apartments, buildings, residencias, cooperatives and projects. Other descriptors, such as the edificio (building) number, apartamento number, proyecto number, building name, tower or torre number and street, may be required to assign the correct geocode to these addresses.

Q: Like the 50 States, Puerto Rico is constantly changing. New residential and non-residential areas and roads are being developed on a daily basis. How does SeekData keep "up-to-date"?

A: With the use of state-of-the-art Trimble® products, SeekData employs research teams that locate new developments across the entire island and gather GPS data on new homes, non-residential buildings, and roads that have been built since January 2000. This is an ongoing effort and vitally important in keeping pace with the thousands of new homes built every year on the island.

Q: If Puerto Rican addresses are so difficult to manage, how is the U.S. Census Bureau able to mail their questionnaires?

A: In the past the U.S. Census Bureau has not used the USPS to mail census questionnaires in Puerto Rico. Instead they have utilized field staff to visit every household (1.3 million) to deliver the census questionnaires. As a result of SeekData's effort, the U.S. Census Bureau will have the option in the 2010 Decennial Census to do a mail-out/mail-back campaign.

Q: Why does the geocoding/standardizing software I use for the 50 States produce such poor results with Puerto Rican addresses?

A: It lacks the necessary data, parsing routines, and standardizing and matching algorithms. Without specialized parsing routines, many accepted abbreviations used in Puerto Rican addressing cannot be told apart from alphanumeric house numbers. 'C127' could be CALLE 127, CARRETERA 127, or house number C127. 'RR3' could be Carretera 3, Rural Route 3, or house number RR3. 'ED4' could be EDIF 4, TORRE 4, or house number ED4.

Keep in mind that address elements in Puerto Rico are not always presented in a consistent order. 'C22' at the beginning of an address may not be a house number. It could just as well indicate Carretera 22, Calle 22, or Edificio C22.
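
One way to picture the problem is a parser that refuses to pick a single reading and instead lists every candidate for a token, leaving the choice to later matching against reference data. The prefix table below contains only a few of the readings mentioned in this answer. The function name and output format are hypothetical, and this is a minimal sketch rather than a description of any actual parsing routine.

```python
import re
from typing import List

# Readings taken from the examples in the text; a real table would be far larger.
PREFIX_READINGS = {
    "C": ["CALLE", "CARRETERA"],
    "RR": ["CARRETERA", "RURAL ROUTE"],
    "ED": ["EDIF", "TORRE"],
}

# A token made of letters followed by digits, such as C127 or RR3.
TOKEN_RE = re.compile(r"([A-Z]+)(\d+)")


def candidate_readings(token: str) -> List[str]:
    """List every plausible reading of a token instead of guessing one."""
    token = token.upper()
    # The token can always be a literal house number.
    candidates = [f"HOUSE NUMBER {token}"]
    match = TOKEN_RE.fullmatch(token)
    if match:
        prefix, number = match.groups()
        for reading in PREFIX_READINGS.get(prefix, []):
            candidates.append(f"{reading} {number}")
    return candidates
```

Here `candidate_readings("C127")` returns three candidates (house number C127, CALLE 127 and CARRETERA 127), which is exactly the ambiguity that software designed for the 50 States tends to collapse into a single, often wrong, house number.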

When Puerto Rican address data is forced to conform to electronic file formats or paper forms designed for U.S. addresses, inadequate space and too few fields prevent the address from being fully represented. Key words are often the first thing to go. BO CANTERA becomes CANTERA; CALLE PARQUE DE LA LUNA becomes PARQ LUNA. In many cases a municipio, urbanizacion, residencia, barrio and street all share the same name, and the qualifiers that help distinguish one from another have been omitted.

Q: If addressing in Puerto Rico is so problematic, how does the mail get delivered?

A: In two words: "local knowledge". The mail gets delivered because the mail carrier walks by Roberto Hernandez's house six times a week. He has seen Roberto's address expressed in many ways; in other words, he has acquired local knowledge. The problems surface when you ask a computer to match Roberto's address to some other data set, such as a USPS file for address verification or a street address range for geocoding. It is also a problem if you are mailing to a list of addresses and need to know whether the Roberto Hernandez at 12-14 CALLE 34 SE is the same Roberto Hernandez at CASA 14 C 34 SE BLQ 12. Note that misaddressed mail, other than mail sent at the First-Class rate or otherwise endorsed, will not reach the mail carrier and is discarded as waste.
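
The Roberto Hernandez example can be made concrete with a small sketch that reduces both forms to the same components before comparing them. It assumes, as the example implies, that `12-14` means block 12, house 14, and that `BLQ`, `CASA` and `C` mark the block, house and calle. The two layouts, the field names and the function names are hypothetical and cover only these two forms.

```python
import re
from typing import Dict, Optional

# Two assumed layouts, one per form in the example; real data has many more.
LAYOUTS = [
    # "<block>-<house> CALLE <street> <suffix>"
    re.compile(
        r"(?P<block>\d+)-(?P<house>\d+) (?:CALLE|C) "
        r"(?P<street>\d+) (?P<suffix>[A-Z]{1,2})"
    ),
    # "CASA <house> C <street> <suffix> BLQ <block>"
    re.compile(
        r"CASA (?P<house>\d+) (?:CALLE|C) "
        r"(?P<street>\d+) (?P<suffix>[A-Z]{1,2}) BLQ (?P<block>\d+)"
    ),
]


def parse_components(address: str) -> Optional[Dict[str, str]]:
    """Split an address into block, house, street and suffix, or return None."""
    text = " ".join(address.upper().split())
    for layout in LAYOUTS:
        match = layout.fullmatch(text)
        if match:
            return match.groupdict()
    return None


def same_address(first: str, second: str) -> bool:
    """True only when both addresses parse and all components agree."""
    a = parse_components(first)
    b = parse_components(second)
    return a is not None and a == b
```

With these rules, `same_address("12-14 CALLE 34 SE", "CASA 14 C 34 SE BLQ 12")` is true, whereas a plain string comparison would treat the two as unrelated. The point of the sketch is that matching has to happen on parsed components, not on the raw text.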

Q: Can I use the TIGER/Line® files to geocode addresses in Puerto Rico?

A: Some of the centerline data for Puerto Rico is fairly good. It was originally derived by digitizing the USGS 1:20,000-scale topographic quadrangles rather than from the GBF/DIME files. The address ranges are not as good, and at this time no references exist for urbanizations, making the files next to impossible to use for geocoding.

Please read the U.S. Census Bureau description below:

ADDRESS ANOMALIES IN PUERTO RICO
The TIGER/Line files contain some address range coverage for Puerto Rico. However, use of this information for geocoding purposes may be problematic and the data user should proceed with caution. These address ranges are preliminary attempts at using Puerto Rico address ranges in Census Bureau files. At present, there are inconsistencies, overlaps, and duplication of address ranges. Address ranges may lack alpha character prefixes or have hyphenated prefixes. The files also lack the community names used in a four-line address that the U.S. Postal Service requires to avoid duplicate addresses. Errors in the reference files, and other factors may limit the usefulness of this product for geocoding purposes.

Q: How large a market is Puerto Rico?

A: Puerto Rico has one of the largest economies in Central America and the Caribbean with a gross domestic product of approximately $70 billion. Personal income is approximately $41 billion per year. If Puerto Rico were a state, it would rank 27th in population (3,808,610). For detailed economic information visit the Government Bank of Puerto Rico.

© 2006, SeekData, Inc.