

FatMatch Version 0.3 - March 01 2009 (beta)


Fatone85


An odd coincidence struck me this morning.

Stu sent me the source code for his Fuzzy Match application earlier this week and as I went through his code that performs the actual matching, I quickly came to realize that there is virtually no difference between his method and mine.

One key difference is his handling of numbers. For example, his code attempts to match year strings against their shortened forms (1995 as 95 and vice versa). I'll implement that in the next release.
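For anyone curious what the year handling could look like, here is a minimal sketch in Python. It is not Stu's code or FatMatch's; the helper names `year_variants` and `years_match` are made up for illustration, and treating a two-digit token as either 19xx or 20xx is my own assumption.

```python
import re

def year_variants(token):
    """Return the token plus its long/short year forms (hypothetical helper)."""
    variants = {token}
    if re.fullmatch(r"(19|20)\d{2}", token):
        variants.add(token[2:])        # 1995 -> 95
    elif re.fullmatch(r"\d{2}", token):
        variants.add("19" + token)     # 95 -> 1995
        variants.add("20" + token)     # 95 -> 2095 (the century is ambiguous)
    return variants

def years_match(a, b):
    """True when two tokens share at least one year form."""
    return bool(year_variants(a) & year_variants(b))

print(years_match("1995", "95"))  # True
print(years_match("1995", "96"))  # False
```

The ambiguity of the century is the main trade-off: accepting both expansions keeps the check simple but can produce the odd false positive.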

The only other real big difference is my use of RegEx patterns which I believe speeds up the search process.

I just found it strange how our functions are nearly identical and thought I would share that with you folk.

Cheers



Glad you managed to get some use from my code Fatone85

Stu


I've been testing this all weekend, and I came across a few bugs here and there. I fixed the ones I found.

I've also tweaked the matching code for better accuracy and added an "Export Copy Script" option, which saves a BAT file of all the commands that are run during the file rename process.
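To give a rough idea of what an exported script might contain, here is a small Python sketch. The exact commands FatMatch writes are not shown in this thread, so the use of `copy`, the `@echo off` header and the function name `build_copy_script` are all assumptions on my part.

```python
def build_copy_script(renames):
    """Build the text of a BAT file from (source, target) path pairs.

    Hypothetical helper: one quoted `copy` command per pair.
    """
    lines = ["@echo off"]
    for source, target in renames:
        lines.append(f'copy "{source}" "{target}"')
    # BAT files conventionally use Windows line endings.
    return "\r\n".join(lines) + "\r\n"

script = build_copy_script([
    ("emerson_arcadia_escape_box.png", "Escape (USA, Europe).png"),
])
print(script)
```

Quoting both paths matters here because ROM names routinely contain spaces, commas and parentheses.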

So I've released version 0.3 which has all these changes.

Let me know of any bugs you find.

Cheers

ChangeLog:

--------------------------------

Version 0.3 - March 01 2009

--------------------------------

- Tweaked match process

- Fixed list missing images bug (Value 101 Error)

- Fixed match results not displaying proper percentage

- Added "Export Copy Script" Option


  • 3 weeks later...
Circo,

Where are these on the FTP... I don't see them

Yeah, I don't see them there either as of yesterday.

Anyway, FatOne85, the forum should be good for the download, but let me know if you need a little web space or FTP. You've obviously written a nice app here, and stuff tends to get buried a little in the forum.

I'm locked into a server contract for 20 months, and although it's a good deal price-wise, it's sitting there not doing much at the moment, so let me know. I dunno, it may even be worth including in the GameEx download and getting it more exposed that way. Or at least more prevalent somehow.


Circo is busy setting up a new FTP, so currently there are two active. The files aren't available on the old one, but they are on the new one. I'm looking at them as we speak.



Hey Tom.

Thanks for your continued interest in FatMatch. It's an honour to have you mention it.

If you feel that it's worthy enough to be included in the GameEx download, then by all means ... hell yes!

I'm currently working on some added features for the 0.4 release, which will include a CRC matching scheme (this is proving to be pretty handy, and I'll elaborate more on it in a few days).

I've also noted a few requested changes from other posts that I've come across, and I'm working on adding those features as well.

Thanks!



Fatone,

I just tried it and it works very easily. I'm not a programmer, or anything, so I am the perfect 'idiot' test subject.

My only issue - Out of a folder of 19 box images it picked up 1.

Now, these are raw, untouched images, meaning, not using Good, Tosec, or any other naming software. These are from websites, and just a right-click and save by me.

Here is what I have:

example

-------------

rom name: Escape (USA, Europe)

original image name: emerson_arcadia_escape_box

did not find

rom name : Tanks a Lot (USA, Europe)

original image name: emerson_arcadia_tanks-a-lot_box

found and renamed correctly

So I don't know what it was able to see in the 'Tanks a Lot' name that it missed in the Escape name. The 'Tanks a Lot' name was the only one it saw and successfully renamed. Maybe have it ignore '-' and '_' as well, as many of the images on websites out there have those characters.

It's a great little program, and I am sorry I found it so late; it would have saved me a ton of time. Keep up the great work!

IMBerzerk - from renaming all these $#%@ images!!!!!


I've used this program also. It seems to have a problem matching certain names. I was using it to match the Emu video clips to ROM names. It seemed like it just would not match some of the ROMs to the videos. The file names would be almost identical, except for an underscore or something. I turned the match percentage down to below 10%, and that allowed it to match up a lot more, but some would still not match up with any of the ROMs at all. Human error may have played a part in this problem, as I didn't use it very long. I do plan to continue to use it, as it does make renaming files easier when you have a folder with hundreds of files you need to match with another file name. Great little app!


FatMatch looks at the file names and calculates a match percentage based on the words in both file names.

In the case of Escape, you have 3 words in the ROM name and 4 words in the box image name (assuming the '_' are converted to spaces), but only 1 word in the title. One word matches, which means that your match percentage would be somewhere around 25%. FatMatch has a threshold below which it will assume things are not matched. This is one example of a match that won't work out well for the software.

In the case of the other game, it has 3 words in its title. So, even if there are other words that don't match, the 3 that do will increase the match percentage enough that it should show up as a match.

I don't know an algorithmic solution for this type of situation. The only thing I can really suggest is making sure there is a way in the software to show lower match files. Perhaps running through the directory once for the low hanging fruit and again for the tougher files? I don't know, it's a tough one...


Hi all,

Thanks for testing this software and reporting your issues.

In this case, bkenobi is correct.

Using the ultimate algorithm to match words or "Fuzzy Matching" is incredibly tedious.

You have to somehow predict what the user would see and validate as a match if he/she was doing it manually.

One example of a match that would seem very obvious to the human eye - but not so much for the program - is in the case of say "Bomberman" and "Bomber Man".

The application always matches two strings of words. The first being the filename of the ROM, and the second being the filename of the Image.

It first takes the ROM name and splits it into words, and gets a percentage of the match performed. Then it does the same for the Image name, against the ROM name. The result of these two matches yields two values that are weighted at 50% each of the final Match Percentile.

So in the case above, the program takes the ROM name (Bomberman) and splits it into words. Since Bomberman is technically a single word, it starts matching it against the Image name.

When Bomberman is tested against Bomber, it returns a match count of 0 (since Bomberman in reality is not the same as Bomber). When it is tested against "Man", the result is the same.

Now that takes care of half the search, where you are still at 0% matched.

Next the program would split the Image name into words and find 2 words (Bomber and Man) and match them individually against "Bomberman". So if you follow me, Bomber IS present in Bomberman, so there's 1 out of 2 words matched. Onto the next word, "Man" IS present in Bomberman, which brings us to 2 out of 2 words correct, or, 100% of the second string tested.

Gathering all the results that we just calculated in the past fraction of a second, we can determine that the first of two searches yielded a 0% match (0% x 0.5 = 0) and the second search yielded a 100% match (100% x 0.5 = 50). Add them together, and you have a final result of 50%. (0% + 50% = 50%)
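The walkthrough above can be sketched in a few lines of Python. This is only an illustration of the two-way, 50/50 weighting as described in this post, not FatMatch's actual source: the function names are made up, and two details are my assumptions, namely that a word "matches" when it appears inside any word of the other name, and that names are split on anything that is not a letter or digit (which also handles '_' and '-').

```python
import re

def split_words(name):
    """Lower-case a name and split it on anything that is not a letter or digit."""
    return [w for w in re.split(r"[^a-z0-9]+", name.lower()) if w]

def one_way_percent(source_words, target_words):
    """Percentage of source words that appear inside at least one target word."""
    if not source_words:
        return 0.0
    hits = sum(1 for w in source_words if any(w in t for t in target_words))
    return 100.0 * hits / len(source_words)

def match_percent(rom_name, image_name):
    """Average the ROM-to-image and image-to-ROM percentages, 50% weight each."""
    rom_words = split_words(rom_name)
    image_words = split_words(image_name)
    rom_side = one_way_percent(rom_words, image_words)
    image_side = one_way_percent(image_words, rom_words)
    return 0.5 * rom_side + 0.5 * image_side

print(match_percent("Bomberman", "Bomber Man"))  # 50.0
```

Running the same sketch on "Escape (USA, Europe)" against "emerson_arcadia_escape_box" gives roughly 29% (1 of 3 ROM words, 1 of 4 image words), which is in line with bkenobi's estimate of about 25% and explains why that pair falls under a typical threshold.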

So you can maybe see how sometimes the computer is too smart for its own good. The computer does not know that Bomberman is pronounced the same as Bomber Man when spoken, and therefore doesn't differentiate properly between combined words and separated words.

IMBerzerk, the reason why your search rendered bad results is an error on my part.

For some wonky reason, the portion of code that replaces underscores with spaces was commented out in my code.

I'm glad you pointed this out as it is a potentially crucial mistake.

Thanks a mil!


You could search for all permutations of a title, removing one space at a time. With a title of "The Bomber Man", do the searches as you have them now, +TheBomber +Bomberman +TheBomberMan. I don't know how realistic this is, or if it creates more problems than it solves...



I can understand how that would seem like the better way to go, and I assure you that this is the way I had originally coded the algorithm, although the problem arose when I got into larger amounts of files.

For example, I have a set of 872 Sega Genesis ROMs and I just downloaded a set of Title images that contained about 1000 images. The average word count of the ROM filenames is 4. With 4 words in a filename, you have 7 possible neighboring permutations plus the individual words themselves for a total of 11 attempts at matching. The algorithm does this on both strings, so that becomes 22 attempts. This means that each time a single ROM filename is matched against the 1000 images, 22,000 attempts have been passed through the search code. So with a set of 872 ROMs, the code would have cycled 19,184,000 times. And that's just on an average of 4 words per filename. Some of these filenames might have 9 or 10 words, which makes the count grow even faster.
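To make the blow-up concrete, here is a small Python sketch that lists the joined neighbouring words for a title and reproduces the arithmetic above. The function names are hypothetical and this is not the original FatMatch code. One caveat: a strict enumeration of neighbouring joins gives 6 for a four-word name (10 attempts with the single words), so the original code presumably counted one more candidate than this sketch does; the total below uses the figure of 11 from the post.

```python
def joined_runs(words):
    """All joins of two or more neighbouring words, in order of starting word."""
    runs = []
    for start in range(len(words)):
        for end in range(start + 2, len(words) + 1):
            runs.append("".join(words[start:end]))
    return runs

def total_attempts(attempts_per_name, image_count, rom_count):
    """Both names in a pair are expanded, hence the factor of 2."""
    return 2 * attempts_per_name * image_count * rom_count

print(joined_runs(["The", "Bomber", "Man"]))
# ['TheBomber', 'TheBomberMan', 'BomberMan']
print(total_attempts(11, 1000, 872))  # 19184000
```

The number of joins grows with the square of the word count (n words give n(n-1)/2 joins), so 9 or 10 word filenames cost several times more than the 4 word average.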

The app was so boggy when I tried doing it this way that I had to change it.

What I have implemented as a near fix is to have the program search a word for Upper-cased letters and separate it at that point. (i.e. BomberMan becomes Bomber Man)
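A minimal Python sketch of that upper-case split, using a regular expression. The function name is made up and the real FatMatch code may do this differently; here a space is inserted wherever a lower-case letter is directly followed by an upper-case one.

```python
import re

def split_on_capitals(word):
    """Insert a space between a lower-case letter and a following upper-case letter."""
    return re.sub(r"([a-z])([A-Z])", r"\1 \2", word)

print(split_on_capitals("BomberMan"))  # Bomber Man
print(split_on_capitals("Bomberman"))  # Bomberman
```

As the second call shows, this only helps when the combined word keeps its capitals, which is why it is a near fix rather than a full one.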

I'm also working on a guide to help people use FatMatch to its full potential. In this guide, I suggest matching at a high accuracy rate like 80% or higher and then renaming the files that you select. You can then run the "List Missing Images" function to move the matched Images and ROMs to a separate temporary folder, after which you can run another matching search at a lower accuracy like 49%, which would yield broader results without over-listing possibilities.
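The strict-then-loose workflow can be sketched roughly like this in Python. It is a simplification under stated assumptions: `multi_pass_match` is a hypothetical name, `score` stands for any function returning a match percentage, and the sketch accepts the best candidate automatically, whereas in FatMatch you pick the matches to rename yourself.

```python
def multi_pass_match(roms, images, score, thresholds=(80, 49)):
    """Match at a strict threshold first, then retry the leftovers more loosely."""
    matches = {}
    remaining_roms = list(roms)
    remaining_images = list(images)
    for threshold in thresholds:
        for rom in list(remaining_roms):
            if not remaining_images:
                break
            # Best-scoring image among those not yet claimed by an earlier match.
            best = max(remaining_images, key=lambda img: score(rom, img))
            if score(rom, best) >= threshold:
                matches[rom] = best
                remaining_roms.remove(rom)
                remaining_images.remove(best)
    return matches, remaining_roms, remaining_images

# Toy scores, invented purely for the demonstration.
toy = {("A", "a"): 90, ("A", "b"): 10, ("B", "a"): 60, ("B", "b"): 55}
result = multi_pass_match(["A", "B"], ["a", "b"], lambda r, i: toy[(r, i)])
print(result)  # ({'A': 'a', 'B': 'b'}, [], [])
```

Taking the confident matches out of the pool first is what keeps the second, looser pass from over-listing: in the toy data, "B" would otherwise have been offered image "a" at 60%.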

Please keep the suggestions coming!



No problem, I am happy to help. It's a nice little program that is super easy to use (what I need). I like it.

Another suggestion would be to have all ROMs from your selected folder listed in the tree, and those that are missing images displayed in red. Here's why...

As I collect images, I need a list... something to compare to and look for. I tend to do this on paper. I go through each game, one by one to see what is wrong, and what is missing. When you do this with an NES set, it can take forever.

So if you display a tree like you have, but also include the ROMs with missing images, I can then keep that open while I surf the net or check all my folders to see what happened. All you show now is what matched.

So you could have a 'found/corrected' list and a 'missing' list for each image type (box, snap, cart...) as you individually check each folder.

All the best!


Hi Fatone85,

I posted a topic earlier because like many I have problems auditing all of my media collections. Out of that I have some suggestions which you could maybe look at.

  • In "List Missing Images" you can generate a HAVE and a MISSING list. It would be nice if you could also generate an UNMATCHED list of the files in the images location that did not match any ROMs. That way you could view files which may not be needed and can just be deleted. I came across a program, Shot Reporter v1.0, which does this, but it is very outdated and missing some things like case-insensitive comparison. I like your program, but it would be nice if this could be added.
  • It would be awesome if you could also use FatMatch through the command line. That way I could use it in a batch file. Maybe something like FatMatch.exe /roms "rompath" /images "imagespath" /lists "listpath", where listpath is the path where it automatically saves the three lists (have, miss and unmatched).
  • Being able to specify paths in the program by simple copy and paste. Right now the fields are greyed out and you have to use the browse button.
  • Being able to select a recursive option for both paths so that it will search subdirectories. I saw that function in the Shot Reporter program mentioned above. Maybe you could take a look to see if it has other features to your liking. If the command line gets implemented, it would be good to be able to enable this there also :P
  • When generating lists, it would be nice to get the totals displayed. For example, 50/105 on the have list would mean you have 50 of 105 ROMs. The same goes for the miss list (and the unmatched list if implemented). If the command line gets implemented, maybe also set these as variables, like FM-TOTALROMS=105, FM-HAVE=50, FM-MISS=55, UNMATCHED=0.
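To show what I mean by the three lists and the totals, here is a rough Python sketch. It is only an illustration of the suggestion, not anything FatMatch does today: the function names are made up, files are compared by name without extension, and the comparison ignores case.

```python
from pathlib import Path

def stems(folder, recursive=False):
    """File names without extension in a folder, optionally including subfolders."""
    pattern = "**/*" if recursive else "*"
    return {p.stem for p in Path(folder).glob(pattern) if p.is_file()}

def classify(rom_stems, image_stems):
    """Split names into have / missing / unmatched, ignoring case."""
    roms = {s.lower() for s in rom_stems}
    images = {s.lower() for s in image_stems}
    return {
        "have": sorted(roms & images),       # ROMs that have an image
        "missing": sorted(roms - images),    # ROMs without an image
        "unmatched": sorted(images - roms),  # images that belong to no ROM
    }

def summary(report):
    """Totals in the 'have/total ROMs' form, e.g. 50/105."""
    total = len(report["have"]) + len(report["missing"])
    return f'{len(report["have"])}/{total}'

report = classify(["Escape", "Tanks a Lot"], ["escape", "old_shot"])
print(report)
print(summary(report))  # 1/2
```

With real folders the call would be something like classify(stems(rompath), stems(imagespath, recursive=True)), where the two paths are whatever you would otherwise pick with the browse button.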

Sorry for the load of suggestions. Just thinking out loud so to speak as to what I came across until now. :D


One more thing. When running the match wizard, I noticed that the results also list the files which already match 100%. Why is this? Doesn't this just make the list a lot longer than necessary?

For example, I was trying to rename boxshots for my SNES collection. I already have 702 PNGs which match, so I don't need to see these. But they did turn up in the list as 100%. Could these be left out, or at least hidden through an option?


  • 3 weeks later...

It seems that FatMatch cannot browse ROMs within subfolders. There are many platforms like Amiga CD32, 3DO, Sega Dreamcast, Sega Saturn, PS1, etc. where I have the ROMs (bin, iso files) within a main folder for each game. Is there a way for FatMatch to search inside those subfolders if you give it the main folder?
