Fun with File Types

Jun 17, 2013

We use files on our computers every day. We open them, save them, edit them, and move them around, but what do we really know about our files?

A file’s name consists of two basic parts: the actual name of the file (“My Cool Blog Post”) and the extension (“.docx”). The name is how you identify the file’s contents, while the extension tells your computer what program to use to open your file. Your computer remembers what program you have chosen to associate with a given extension. For example, an extension of .doc or .docx, while often associated with Microsoft Word, can also be associated with other programs such as OpenOffice Writer and AbiWord. Files with an extension of .xls or .xlsx are commonly associated with Microsoft Excel, but again you have other options such as OpenOffice Calc. Notepad is often associated with .txt files (these are plain text files, which may or may not be comma or tab delimited), but files with this extension can be opened with many other programs. Adobe Reader may be what you think of when you see the .pdf extension, but you can associate even this with another program such as Google Chrome if you desire. In the case of .pdf and .txt, the extension has become a way to refer to the file itself (“I need it in a pdf.” or “Can you send me that text document?”). If you are thinking, as you read this, “I don’t see any extensions when I look at my files,” this is because Windows hides file extensions by default. You can very easily change this setting.
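If you ever need to separate the two parts of a file name in a script, here is a minimal sketch using Python's standard pathlib module. The function name split_name is just an illustrative choice, not something from any particular product.

```python
from pathlib import Path

def split_name(filename):
    """Return (name, extension) for a file name; the extension keeps its dot."""
    p = Path(filename)
    # .stem is everything before the last dot, .suffix is the last dot onward
    return p.stem, p.suffix

name, ext = split_name("My Cool Blog Post.docx")
print(name)  # My Cool Blog Post
print(ext)   # .docx
```

Note that only the part after the last dot counts as the extension, so a name with several dots keeps the earlier ones in the name portion.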

The extension not only tells your computer what program to associate with your file, it also signals how the file was formatted upon save (behind the scenes), which in turn is what the associated program expects to find when it attempts to open the file. One of the fastest ways to render a file unusable is to change its extension by right-clicking on the file and choosing “Rename”, rather than by opening the file and performing a “Save As” action where you can select another Type (extension). Renaming changes only the label; the contents of the file are not converted to match the new extension. If you just change the extension, Windows will try to save you from the mistake by giving you a warning.
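To see that the contents of a file do not depend on its extension, you can look at the first few bytes of the file, which many formats use as a signature. The sketch below checks three well-known signatures; the table is deliberately tiny, and the names SIGNATURES, sniff, and sniff_file are illustrative assumptions.

```python
# A few well-known file signatures ("magic bytes") found at the start of a file.
SIGNATURES = {
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP container (the format used inside .docx and .xlsx)",
    b"\xd0\xcf\x11\xe0": "legacy Office file (.doc or .xls)",
}

def sniff(data):
    """Guess a format from the leading bytes, ignoring the file name entirely."""
    for magic, label in SIGNATURES.items():
        if data.startswith(magic):
            return label
    return "unknown (possibly plain text)"

def sniff_file(path):
    """Read only the first 8 bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return sniff(handle.read(8))
```

A Word document renamed to end in .pdf would still be reported as a ZIP container here, which is why the program associated with .pdf cannot open it.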

An exception to this rule that ImportOmatic clients may encounter when importing to Blackbaud’s The Raiser’s Edge 7 is when a time/date stamp has been appended to the file name after processing (Ex: MyDataFile.csv.20130523_120456). In this case you may simply remove the time/date stamp by using the right-click Rename function (Result: MyDataFile.csv), ignoring the Windows warning.
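If you have many such files, the rename can be scripted. This is a rough sketch that assumes the stamp always looks like the example above (a dot, eight digits, an underscore, six digits) at the very end of the name; the function name strip_stamp is hypothetical.

```python
import re

# Assumed stamp shape, based on the example: ".YYYYMMDD_HHMMSS" at the end of the name.
STAMP = re.compile(r"\.\d{8}_\d{6}$")

def strip_stamp(filename):
    """Remove a trailing time/date stamp, leaving other names unchanged."""
    return STAMP.sub("", filename)

print(strip_stamp("MyDataFile.csv.20130523_120456"))  # MyDataFile.csv
```

The function only computes the new name. To actually rename a file you could pass the result to os.rename, ideally after making a backup copy.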

What if we want to open our file with a different program than the one it is currently associated with on our computer? For example, opening a .csv file in Excel will make it much easier to read and edit than opening it with Notepad. Fortunately, there is overlap in file formatting that sometimes allows files to be successfully opened by programs other than the ones your computer currently associates with their extensions. Doing this generally requires extra steps, because if you simply double-click to open the file, your computer will use the associated program. To open a file with a different program, first open the desired program, then choose the ‘Open’ menu option and browse to the desired file. Depending on the formatting of the file you are attempting to open, you may also have to tell the program how to handle it. This is the case when you use Excel to open a tab delimited text file. Excel will present you with a wizard; when it asks for the delimiter, choose “Tab.” Some formatting may not carry over when you open a file in a different program than the one used to create it. If you attempt to open a file with a program that is incompatible with the file’s formatting, you will either receive an error, or the file will open as a mess of scrambled letters, Wingdings, and control characters.
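The same idea of telling a program which delimiter to expect applies outside of Excel. As a small illustration, here is how Python's built-in csv module reads tab delimited text; the sample data and the function name read_delimited are made up for this sketch.

```python
import csv
import io

def read_delimited(text, delimiter="\t"):
    """Split delimited text into a list of rows, each a list of cell values."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

sample = "Name\tCity\nAda\tLondon\n"

# Correct delimiter: each row splits into two cells.
print(read_delimited(sample))

# Wrong delimiter: each row stays as one cell with the tab still inside it.
print(read_delimited(sample, delimiter=","))
```

Choosing the wrong delimiter does not raise an error; the data simply lands in a single column, which is similar to what you see when a wizard setting is wrong.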

As with opening a file in a different program, when you save a file as a different type than the default for the program you are using, some things may not carry over to the resulting file. For example, when you save a file as .csv or .txt from within Excel, most formatting, such as highlights, locked columns, text colors, and inserted tables, is lost because these are text-only file types.

One of the other things to be aware of when saving a file as a .csv from within Excel is the handling of “long numbers”. Excel changes numbers it considers to be “long numbers” to Scientific Notation (Ex: 9.23E+11) in some file formats. If you search the Internet for “Excel Scientific Notation”, you will get back many results related to working around this behavior! The important thing to note is this: if you see the cells displaying in Scientific Notation and you SAVE the file, Excel will permanently change the actual data to be in Scientific Notation. Generally, expanding the columns so that the numbers display properly before saving will prevent this, but not always. If you find this behavior to be a problem, you can try saving as a tab delimited text file instead. This will sometimes get around the issue.
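Before importing a saved file, it can be worth checking whether any values ended up in Scientific Notation. The following is a minimal sketch that scans rows (for example, rows read with Python's csv module) for values shaped like 9.23E+11. The pattern and the name find_scientific are assumptions made for illustration, and the check can flag legitimate text that happens to look the same.

```python
import re

# Matches values such as 9.23E+11 or 1E+15 (one leading digit, optional decimals).
SCI = re.compile(r"[+-]?\d(\.\d+)?E[+-]?\d+", re.IGNORECASE)

def find_scientific(rows):
    """Return (row index, column index, value) for each suspicious cell."""
    hits = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if SCI.fullmatch(value.strip()):
                hits.append((r, c, value))
    return hits

rows = [
    ["ID", "Name"],
    ["9.23E+11", "Ada"],      # damaged: the original digits are gone
    ["923000000001", "Bob"],  # intact
]
print(find_scientific(rows))  # [(1, 0, '9.23E+11')]
```

A script like this can only detect the problem. Once the digits have been replaced in the saved file, they have to be recovered from the original data.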

Editing your files can also pose challenges when the editing is performed in a program other than the one associated with the file’s extension. If we go back to the example of opening a .csv or .txt file using Excel, there are some things to keep in mind as you edit your file. In Excel, the option to “clear cells” and the option to “delete cells/rows” do not produce the same results. When you clear cells, Excel holds on to the memory that you had used the cells previously, whereas deleting removes them entirely. This memory of the other cells can cause errors in ImportOmatic that indicate your data file has more columns than your profile. If this happens, the easiest solution is to select only the cells and rows that contain data, copy them to a new sheet, and save the file with a different name.
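If copying to a new sheet is not convenient, the leftover empty columns and rows can also be trimmed from the saved text file with a short script. This sketch assumes the extra cells appear as blank values at the end of each row and as entirely blank rows; used_width and trim_unused are hypothetical names, not part of ImportOmatic.

```python
def used_width(rows):
    """Number of columns up to and including the last one that holds any data."""
    width = 0
    for row in rows:
        for i, cell in enumerate(row):
            if cell.strip():
                width = max(width, i + 1)
    return width

def trim_unused(rows):
    """Drop entirely blank rows and cut every row to the used width."""
    width = used_width(rows)
    return [row[:width] for row in rows if any(cell.strip() for cell in row)]

rows = [
    ["ID", "Name", "", ""],
    ["1", "", "", ""],
    ["", "", "", ""],
]
print(trim_unused(rows))  # [['ID', 'Name'], ['1', '']]
```

The width is measured across the whole file rather than row by row, so a legitimately blank value inside a used column (the empty Name above) is kept.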

Regardless of how you create, access, or change your files, the best piece of advice I can give you is to maintain backup copies of your original files. To paraphrase an IT adage: Only have backups for files you can’t afford to lose. I hope this primer has helped you gain a better understanding of your files!

Omatic Software