I’ve been developing an n-gram based search in PHP for a secondary project at work. For development purposes, I’ve been scoring things in PHP while I tweaked by algorithm, but now that I’m getting closer to the implementation stage of things I’m trying to push the work of scoring onto mySQL.
I was hoping mySQL had some method of counting occurrences of a string within a string, but much to my chagrin it doesn’t have this. Luckily, Andrew Hanna posted a great custom mySQL function that does exactly this. For the sake of posterity and my own reference, I’ve posted his function below.
DROP FUNCTION IF EXISTS substrCount||
CREATE FUNCTION substrCount(s VARCHAR(255), ss VARCHAR(255)) RETURNS TINYINT(3) UNSIGNED LANGUAGE SQL NOT DETERMINISTIC READS SQL DATA
BEGIN
DECLARE count TINYINT(3) UNSIGNED;
DECLARE offset TINYINT(3) UNSIGNED;
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET s = NULL;
SET count = 0;
SET offset = 1;
REPEAT
IF NOT ISNULL(s) AND offset > 0 THEN
SET offset = LOCATE(ss, s, offset);
IF offset > 0 THEN
SET count = count + 1;
SET offset = offset + 1;
END IF;
END IF;
UNTIL ISNULL(s) OR offset = 0 END REPEAT;
RETURN count;
END;
usage is this…
SELECT substrCount('/this/is/a/path', '/') `count`;
Categories
- Awards and Recognition (2)
- C# (2)
- GIS/Maps (1)
- Linux Admin (1)
- mySQL (4)
- Personal (2)
- PHP (6)
- SQL Server (1)
- The Business Side (2)
- Uncategorized (2)
- vbscript (1)
leave a reply
You must be logged in to post a comment.