I am working with community-level data as well, and without names it is especially difficult to identify duplicates, particularly when there are spelling errors, nicknames, or incomplete information. Here are some lessons I have learned that may help ...