I am using Visual Studio C++ 2017 Professional and MFC.


I am working on a project with a function that collects file paths (like C:/foo1/foo2/foo3.txt) as strings whenever a file with that path exists (checked using the filesystem library).

This looked straightforward at first, until I saw that the file paths are often (for lack of a better word) templates.

My program is given a template file path such as C:/User/Documents/%A/%B.txt, where %A stands for a year and %B for a month. The program has a range of years; it iterates through each year, and then through each month, substituting the value and checking whether the resulting folder or file exists.
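The substitution step described above could be sketched roughly as follows. The helper names (zeroPad, replaceAll, expandTemplate) are made up for illustration, and the four-digit year and two-digit, zero-padded month are assumptions based on the example paths in this question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: format a non-negative number with leading zeros.
std::string zeroPad(int value, std::size_t width) {
    assert(value >= 0);
    std::string digits = std::to_string(value);
    if (digits.size() < width) digits.insert(0, width - digits.size(), '0');
    return digits;
}

// Hypothetical helper: replace every occurrence of `token` in `text` with `value`.
std::string replaceAll(std::string text, const std::string& token, const std::string& value) {
    assert(!token.empty());
    std::size_t pos = 0;
    while ((pos = text.find(token, pos)) != std::string::npos) {
        text.replace(pos, token.size(), value);
        pos += value.size();  // continue after the inserted text
    }
    return text;
}

// Expand %A to a four-digit year and %B to a two-digit month (assumed formats).
// The * and ? wildcards are left untouched for a later matching step.
std::string expandTemplate(const std::string& pathTemplate, int year, int month) {
    assert(month >= 1 && month <= 12);
    std::string out = replaceAll(pathTemplate, "%A", zeroPad(year, 4));
    return replaceAll(out, "%B", zeroPad(month, 2));
}
```

For example, expandTemplate("C:/User/Documents/%A/%B.txt", 2022, 7) yields C:/User/Documents/2022/07.txt.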

It only gets more complicated, because we now add two more template symbols: * and ?. For example: C:/User/Documents/*_%A_??_*/%B_*.txt.

Here, * represents 0 or more characters and ? represents one character. So in the example, a filepath like: C:/User/Documents/smile_2022_A1_/07_ABCDEFGHIJKLMNOP.txt should be found and saved as a string.
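Matching a single folder or file name against * and ? does not need regex. Below is a minimal sketch of a common backtracking approach; wildcardMatch is a hypothetical name, and the comparison is case-sensitive, which is a simplification given that Windows file names are normally compared without regard to case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `text` matches `pattern`, where '*' stands for zero or more
// characters and '?' stands for exactly one character.
bool wildcardMatch(const std::string& pattern, const std::string& text) {
    std::size_t p = 0, t = 0;
    std::size_t starPos = std::string::npos;  // index of the most recent '*' in pattern
    std::size_t starText = 0;                 // text index that '*' is currently matched up to
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            starPos = p++;        // remember the star, first try matching zero characters
            starText = t;
        } else if (p < pattern.size() && (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;                  // single character matches
            ++t;
        } else if (starPos != std::string::npos) {
            p = starPos + 1;      // backtrack: let the last '*' absorb one more character
            t = ++starText;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;  // trailing stars match nothing
    return p == pattern.size();
}
```

With this sketch, wildcardMatch("*_2022_??_*", "smile_2022_A1_") and wildcardMatch("07_*.txt", "07_ABCDEFGHIJKLMNOP.txt") both return true, so the example path above would be accepted once %A and %B have been substituted.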

I am able to separate the file path via tokenization, meaning I already have a function that fills a vector with, say, C:, foo1, foo2, foo3.txt. With this I can walk the vector, either with a loop or by recursively entering folders, until I either:

  1. reach a dead end (i.e. the folder or file does not exist)
  2. reach the file I want and save its entire path as a string
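The walk with the two outcomes listed above could look roughly like this with std::filesystem. This is only a sketch under assumptions: segmentMatches and collectMatches are hypothetical names, the caller passes the root directory (for example C:/) separately from the remaining pattern segments, and %A and %B are assumed to have been substituted already. The recursive matcher is short but can be slow for patterns with many * symbols.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Minimal recursive matcher: '*' = zero or more characters, '?' = exactly one.
bool segmentMatches(const char* pat, const char* name) {
    if (*pat == '\0') return *name == '\0';
    if (*pat == '*')
        return segmentMatches(pat + 1, name) ||
               (*name != '\0' && segmentMatches(pat, name + 1));
    return *name != '\0' && (*pat == '?' || *pat == *name) &&
           segmentMatches(pat + 1, name + 1);
}

// Descend one pattern segment per directory level and collect matching files.
void collectMatches(const fs::path& dir, const std::vector<std::string>& segments,
                    std::size_t index, std::vector<std::string>& found) {
    assert(index < segments.size());
    std::error_code ec;
    for (fs::directory_iterator it(dir, ec), end; !ec && it != end; it.increment(ec)) {
        const std::string name = it->path().filename().string();
        if (!segmentMatches(segments[index].c_str(), name.c_str())) continue;
        const bool last = (index + 1 == segments.size());
        if (last && it->is_regular_file(ec)) {
            found.push_back(it->path().generic_string());  // outcome 2: file found
        } else if (!last && it->is_directory(ec)) {
            collectMatches(it->path(), segments, index + 1, found);
        }
        // Any other entry is a dead end (outcome 1) and is skipped.
    }
}
```

A plain path without template symbols goes through the same code, since a segment without * or ? only matches a name that is identical to it.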

One idea I had was to look for _ symbols and split the folder or file name on those, but this breaks down:

  1. if the text matched by * or ? contains a _
  2. if the folder or file name simply does not contain a _

The solution I am looking for ideally does not utilize regex. However, if that is the only solution (obviously I do not want to recreate regex), please advise me on how I can use it on varying path complexities, as my function should allow for both regular folder paths without template symbols and those with them.

The reason I say no regex is because from my understanding, it is a little more strict when it comes to comparing a file. It expects the string to be written in a certain generic but consistent way. However, in my case I cannot expect a user to name their folders in a consistent manner...

  • If you already have the template path tokenized, have you considered just creating a TokenMatcher-like class that evaluates for each token that it matches the template? If you create a vector of these (1 per token), then when walking all files in a directory, just check that each matcher evaluates true for a given path to find a file path that satisfies the template
    – Bitwize
    Commented Aug 1, 2022 at 17:01
  • @Human-Compiler do you have a source on this ? my first time using tokens and I tokenized the string very simply by separating by the char \ Commented Aug 1, 2022 at 17:34
  • @franji1 thank you I will test this out right now and update Commented Aug 1, 2022 at 17:34
  • @franji1, I doubt these APIs will work if the directory part of the path contains wildcards; I think only the filename part can contain wildcards. The application should look for the first wildcard, and if it is in the directory path, search with FindFirstFile/FindNextFile (for directories only) and call the algorithm recursively for each directory found (for the rest of the string). Commented Aug 1, 2022 at 17:39
  • The program stops operating properly as soon as a data point is introduced that falls outside the range 1900-2100. Software is expensive and has a tendency to be in operation way beyond its intended life span. Besides that, the approach is inherently inefficient: You're enumerating all patterns, and for each pattern you are hitting the disk. That's 2400 patterns (200 years with 12 months each). Enumerating all files once, and matching the path names against a pattern causes far less I/O traffic. I/O is expensive. Commented Aug 2, 2022 at 11:18
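One way to read the last comment is to let the matcher itself understand %A and %B, so that each name seen on disk is tested once instead of once per year/month combination. The sketch below is an illustration under assumptions, not the commenter's code: YearRange, parseDigits and templateMatches are hypothetical names, %A is assumed to be exactly four digits and %B exactly two digits in the range 01 to 12.

```cpp
#include <cassert>
#include <cctype>
#include <string>

struct YearRange { int first; int last; };  // hypothetical: inclusive range of accepted years

// Parse exactly `count` decimal digits starting at `s`; returns -1 if a
// non-digit (including the terminating '\0') is found first.
int parseDigits(const char* s, int count) {
    int value = 0;
    for (int i = 0; i < count; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return -1;
        value = value * 10 + (s[i] - '0');
    }
    return value;
}

// Match one folder or file name against one template segment that may contain
// %A (year), %B (month), '*' (zero or more characters) and '?' (one character).
bool templateMatches(const char* pat, const char* name, YearRange years) {
    if (*pat == '\0') return *name == '\0';
    if (pat[0] == '%' && (pat[1] == 'A' || pat[1] == 'B')) {
        const bool isYear = (pat[1] == 'A');
        const int width = isYear ? 4 : 2;
        const int value = parseDigits(name, width);
        if (value < 0) return false;
        const bool inRange = isYear ? (value >= years.first && value <= years.last)
                                    : (value >= 1 && value <= 12);
        return inRange && templateMatches(pat + 2, name + width, years);
    }
    if (*pat == '*')
        return templateMatches(pat + 1, name, years) ||
               (*name != '\0' && templateMatches(pat, name + 1, years));
    return *name != '\0' && (*pat == '?' || *pat == *name) &&
           templateMatches(pat + 1, name + 1, years);
}
```

With a range of 1900 to 2100, templateMatches("*_%A_??_*", "smile_2022_A1_", range) returns true, while a folder named smile_1850_A1_ is rejected without any extra disk access.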

1 Answer

Once you replace your own placeholders (%A, %B, etc.) with concrete values, the FindFirstFile and FindNextFile API calls support the * and ? wildcards.

  • after some quick preliminary testing, this is working exactly as desired, will update if any changes, thank you again :) Commented Aug 1, 2022 at 18:11
  • a follow-up question - if either of the FindFirstFile does not succeed, do I still call the FindClose function to close the handle ? @franji1 Commented Aug 1, 2022 at 19:44
  • From the online documentation (learn.microsoft.com/en-us/windows/win32/api/fileapi/…)... "If the function fails or fails to locate files from the search string in the lpFileName parameter, the return value is INVALID_HANDLE_VALUE and the contents of lpFindFileData are indeterminate." Hence, you should NOT call FindClose w/INVALID_HANDLE_VALUE. I would also assume a NULL/0 value is failure also, so don't call FindClose w/0.
    – franji1
    Commented Aug 1, 2022 at 19:49
  • The online topics point to an example: learn.microsoft.com/en-us/windows/win32/fileio/…
    – franji1
    Commented Aug 1, 2022 at 19:51
  • Clean-up functions can't fail because, well, how do you clean up from a failed clean-up? If FindClose fails then it does so due to a bug in your program. You can log the error, but beyond that there's not much you can do. RAII is a powerful tool in preventing those bugs. Create a wrapper around the HANDLE, initialize it from the c'tor or throw if it's INVALID_HANDLE_VALUE. If you throw, the d'tor won't run, cutting out this failure mode altogether. Commented Aug 2, 2022 at 11:08
