VBA function declaration to remove  from the beginning of a file Ms Access Gurus

Help support this site, thank you.

Remove  from beginning of a text file using VBA

At the beginning of a file,  indicates that the file is encoded using UTF-8, one way of encoding Unicode. I couldn't find a connect string parameter (like perhaps CharSet) to handle UTF-8.

As a result, I wrote VBA code to strip the first 3 characters of the file so the first fieldname or line of data doesn't begin with these odd characters.

Some systems use ISO-8859-1 (which is gradually being changed to UTF-8) to encode files. When Access reads them, that beginning marker causes issues either in the first fieldname or first line of data, depending if there are headers (HDR) or not.

UTF-8 BOM can cause issues

Quick Jump

Goto Top  


Download

Download zipped BAS file you can import into your VBA projects for Access, Excel, and other apps when you need it. mod_File_RemoveUTF8bom_s4p.zip

If you have trouble with a download, you may need to unblock the ZIP file, aka remove Mark of the Web, before extracting the file. Here are steps to do that: https://msaccessgurus.com/MOTW_Unblock.htm

Goto Top  

VBA

Standard module

' module name: mod_File_RemoveUTF8bom_s4p
'*************** Code Start ***************************************************
' Purpose  : strip  from beinning of file contents
' Author   : crystal (strive4peace)
' Code List: www.msaccessgurus.com/code.htm
' This code: https://msaccessgurus.com/VBA/File_RemoveUTF8bom.htm
' LICENSE  :
'   You may freely use and share this code, but not sell it.
'   Keep attribution. Mark your changes. Use at your own risk.
'--------------------------------------------------------------------------------
'                              TextFileStripBOM_s4p
'--------------------------------------------------------------------------------'
Public Function TextFileStripBOM_s4p( _
   psPathFile As String _
   ) As Boolean 
'230127 strive4peace
' strip UTF-8 BOM (byte order mark) 
' from beginning of file 

   'Return
   '  False if no change made to file
   '  True if file was changed
   
   TextFileStripBOM_s4p = False 
   
   Dim iFile As Integer _ 
      ,sFileContents As String _ 
      ,s3 As String 
      
   'get a numeric file handle to refer to the file
   iFile = FreeFile 
   
   'open the file for reading
   Open psPathFile For Input As iFile 
   
   'get first 3 characters of file
   s3 = Input(3,iFile) 
   
   'see if there is a marker for UTF-8
   If s3 <>  "" Then 
      'no changes to file
      GoTo Proc_Exit 
   End If 
   'get rest of file
   sFileContents = Input(LOF(iFile) - 3,iFile) 
   Close iFile 

   'over-write file without BOM characters
   Open psPathFile For Output As iFile 
   Print #iFile,sFileContents 
   
   'indicate that a change to the file was made
   TextFileStripBOM_s4p = True 
   
Proc_Exit: 
   On Error Resume Next 
   Close iFile 
   Exit Function 
  
Proc_Err: 
   MsgBox Err.Description _ 
       ,, "ERROR " & Err.Number _ 
        &  "   TextFileStripBOM_s4p"

   Resume Proc_Exit 
   Resume 
End Function 
'--------------------------------------------------------------------------------
'                              testTextFileStripBOM_s4p
'--------------------------------------------------------------------------------'
Sub testTextFileStripBOM_s4p() 
'230127 s4p test TextFileStripBOM_s4p

   'CALLs
   '  TextFileStripBOM_s4p
   
   Dim sPath As String _ 
      ,sFile As String _ 
      ,sPathFile As String 
   
   sPath =  "C:\myPath"
   sFile =  "Filename.csv"
      
   sPathFile = sPath _ 
      & IIf(Right(sPath,1) <>  "\", "\", "") _ 
      & sFile 
   
  ' Call TextFileStripBOM_s4p(sPathFileIn, sPathFileOut)
   MsgBox TextFileStripBOM_s4p(sPathFile),, "Done"
End Sub 
'*************** Code End *****************************************************
' Code was generated with colors using the free Color Code add-in for Access.

Goto Top  

Reference

Microsoft

FreeFile function

Open statement

Input function

Print # statement

LOF function Length of File in bytes

Wikipedia

UTF-8

ISO/IEC 8859-1 gradually being changed to UTF-8

YouTube video explaining UTF-8 encoding

Characters, Symbols and the Unicode Miracle - Computerphile (9:36)

Goto Top  

Backstory

I'm writing a program to loop through a folder and get data from CSV files And then, I ran into this! Three weird characters at the beginning that had to go!

If there's a way for Access to interpret the file as UTF-8, I couldn't find it. Anyhow, none of the contents requires Unicode, so the best option seemed to be to simply remove the beginning marker characters. Actually, it is just one Unicode character, U+FEFF. FEFF is decimal 65279.

If you like this page, please let me know. Donations, big and small, are much appreciated! They all say Thank you.

Share with others

Here's the link for this page in case you want to copy it and share it with someone:

https://msaccessgurus.com/VBA/File_RemoveUTF8bom.htm

or in old browsers:
http://www.msaccessgurus.com/VBA/File_RemoveUTF8bom.htm

Get Help with Access and converting data

Let's connect and team-develop your application together. I teach you how to do it yourself. My goal is to empower you.

While we build something great together, I'll pull in code and features from my vast libraries as needed, cutting out lots of development time. I'll give you lots of links to good resources.

Access is great at converting and managing data from all kinds of different places. Access is access! I'd love to help you. Email me at training@msAccessGurus.com

~ crystal

the simplest way is best, but usually the hardest to see

Goto Top