SSEP - Site Search Engine PHP-Ajax

This is a completely Free and Open Source Site Search engine script that uses MySQL to store your website's indexed pages and adds Search Functionality to Your Web Site. It is built with PHP and JavaScript (search results are loaded via Ajax).
The search system combines MySQL full-text search with SQL REGEXP, and weights words according to their location in HTML elements, to determine the relevance of the search results.
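
As a rough illustration of location-based weighting, the sketch below adds up a weight for every location in which a search word occurs and then sorts the pages by that score. The weight values, the field names and the scorePage() function are hypothetical; they are not taken from the SSEP source, which does this work in PHP and MySQL.

```javascript
// Hypothetical weights per location; the values used by SSEP may differ.
const WEIGHTS = { title: 10, description: 6, h1: 5, url: 4, strong: 2, body: 1 };

// page: object that maps a location name to the text found there.
// Returns the summed weight of every location that contains the word.
function scorePage(page, word) {
  const needle = word.toLowerCase();
  let score = 0;
  for (const [location, weight] of Object.entries(WEIGHTS)) {
    const text = (page[location] || '').toLowerCase();
    if (text.includes(needle)) score += weight;
  }
  return score;
}

const pages = [
  { url: '/php-tutorial', title: 'PHP Tutorial', body: 'Learn PHP step by step' },
  { url: '/news', title: 'News', body: 'A new PHP version was released' },
];

// Sort so that the page with the highest score comes first.
const ranked = pages
  .map(p => ({ url: p.url, score: scorePage(p, 'php') }))
  .sort((a, b) => b.score - a.score);
console.log(ranked);
```

Here the first page matches in the title, the URL and the body, so it is listed before the second page, which matches only in the body.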

Version 1.9

- To check for a new version, visit the page: coursesweb.net/php-mysql/ssep-site-search-engine-php-ajax_s2

Features

- Intuitive and easy to use Admin Panel, with a simple administration interface and an info mark that describes each function.
- Supports both PDO and MySQLi for accessing MySQL databases in PHP.
- Crawl and index web site pages automatically (can follow redirects).
- Option to Include Subdomains.
- Options to control indexed URLs: by link's Depth, by Maximum number of URLs to crawl, by URL Must-Include, or Must-Exclude "strings".
- Crawl and index the links in the XML Sitemap.
- You can register to Crawl and Index multiple domains.
- Stop words excluded from searches.
- Option to remove parts of the page / HTML elements from being indexed.
- Keeps in the indexed content the text added in the "alt" attribute of the <img> tags (which are outside the removed parts).
- Option to Build XML Sitemap with the indexed pages.
- Possibility to Crawl and index a domain automatically with Cron Jobs.
- Easy to translate into other languages.
- It works with characters with diacritics.
- The Search results are loaded via Ajax (without refreshing the search page). This option can also be disabled.
- Paginated search results.
- Option to choose Infinite or Standard pagination.
- Search Suggestions.
- List with last and top searches.
- The search results are ordered by a Score calculated according to the HTML elements in which the searched word is located (Title, Description, H1, Strong, ... and other tags, even the URL address of the page).
- Cache files system for the search results.
- Search Page with valid HTML5 format and Responsive design (works on Mobile Devices too).
- CSS and HTML template that is easy to customize: you can add new elements to the search page and change the design.
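
The URL-control options from the list above (link depth, maximum number of URLs, Must-Include and Must-Exclude strings) can be pictured as a simple filter. This is only a sketch: the shouldCrawl() function and its option names are invented for illustration and do not mirror the actual crawler code. It also assumes that a URL passes the Must-Include test when it contains at least one of the given strings.

```javascript
// Hypothetical options object; the names are assumptions, not SSEP settings.
const options = {
  maxDepth: 3,                        // how many links away from the start page
  maxUrls: 500,                       // stop after this many crawled URLs
  mustInclude: ['/blog/'],            // URL must contain one of these (if any are given)
  mustExclude: ['?print=', '/tag/'],  // URL must contain none of these
};

// Returns true when the URL passes all the limits and string filters.
function shouldCrawl(url, depth, crawledCount, opt) {
  if (depth > opt.maxDepth) return false;
  if (crawledCount >= opt.maxUrls) return false;
  if (opt.mustInclude.length && !opt.mustInclude.some(s => url.includes(s))) return false;
  if (opt.mustExclude.some(s => url.includes(s))) return false;
  return true;
}

console.log(shouldCrawl('http://example.com/blog/post-1', 1, 10, options));
console.log(shouldCrawl('http://example.com/blog/tag/php', 1, 10, options));
```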

Requirements

- PHP 5.4+ (with cUrl enabled)
- MySQL 5.2+
- Modern Browser with JavaScript enabled (Mozilla-Firefox, Google-Chrome, Opera, Internet-Explorer 9+).

Installation

  1. Open the "config.php" file to edit it (in the "ssep/php/" folder), and add your Name and Password to the $admin_name and $admin_pass variables. They are used to log in to the SSEP Admin Panel.
  2. Edit the following data, for connecting to MySQL.
    $mysql['host'] = 'localhost';            - replace localhost with your MySQL server address.
    $mysql['user'] = 'root';                 - replace root with your database user name.
    $mysql['pass'] = 'passdb';               - replace passdb with your password for MySQL database.
    $mysql['bdname'] = 'dbname';             - replace dbname with the name of your MySQL database.
    
  3. Copy the "ssep/" directory to your server, into the Root folder of your website ("www/", "htdocs/" or "public_html/").
  4. Set CHMOD 0755 (or 0777) on the "cache/" folder on your server (it is used to store cache files with the search results).
  5. Access the "ssep/admin.php" file in your browser, using its address on the server; for example: http://localhost/ssep/admin.php
  6. Log in with the Name and Password set in "config.php" (in the $admin_name and $admin_pass variables), and see the description shown by the info mark associated with each option in the Admin Panel.
  7. Add the following HTML code in the pages of your web site in which you want to include the search form.
    <form action="/ssep/index.php" method="post">
      <input type="text" name="sr" maxlength="45" />
      <input type="submit" value="Search" />
    </form>
    
    - The address in "action" must point to the "ssep/index.php" file.
• The SSEP script automatically registers the current domain in the database and creates the needed tables.

Cron Jobs Usage

• If you want to automatically index the web pages of a domain registered in the SSEP search engine, access the "ssep_cron.php" file with Cron Jobs (from your hosting CPanel), passing "cron=ADMIN_NAME" in the URL query string (it is read from $_GET), where ADMIN_NAME is the name added to the $admin_name variable set in "config.php". See also the comments in the "ssep_cron.php" file.
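
As a small illustration of the address a Cron Job would request, the sketch below builds it from the site address and the admin name. The cronUrl() helper is hypothetical, and so is the location of "ssep_cron.php": it is assumed here to sit directly in the "ssep/" folder, so check where the file is on your server.

```javascript
// Builds the address that the Cron Job should request.
// Assumption: ssep_cron.php is located in the "ssep/" folder of the site.
function cronUrl(siteRoot, adminName) {
  const root = siteRoot.replace(/\/+$/, '');   // drop trailing slashes
  // encodeURIComponent keeps the query string valid if the name has spaces or symbols
  return root + '/ssep/ssep_cron.php?cron=' + encodeURIComponent(adminName);
}

console.log(cronUrl('http://your_domain/', 'my admin'));
// http://your_domain/ssep/ssep_cron.php?cron=my%20admin
```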

Info for the other configurations in "config.php"

Adding Search Suggestions

The Search Page includes a Search-Suggestions feature (the suggestions are Titles of indexed pages which contain the words typed in the search form). By default, the SSEP script displays 10 rows with suggestions. This number can be changed in the "Advanced Settings" section of the Admin Panel. To disable this feature, just set the value of the "Search Suggestions" field in the Admin Panel to 0.
• If you want to have a form with Search-Suggestions in the other pages of your website, add this code in the site pages (an HTML form, and JavaScript with Ajax for suggestions):
<form action="/ssep/index.php" method="post" id="search">
  <input type="search" name="sr" autocomplete="off" maxlength="45" id="ssep_inp" pattern="[A-z0-9\u00C0-\u00FF_ \-]{3,45}" required="required"
  title="Between 3 and 45 characters: Letters, Numbers, Lines, and Space." placeholder="Search" />
  <input type="submit" value="Search" />
</form>
<script>
// SSEP - Search Suggestions - from: https://coursesweb.net/ 
// Ajax - Receives data to send, and a callback function (called when the response is received)
function ajaxSend(datasend, callback) {
  var request =  (window.XMLHttpRequest) ? new XMLHttpRequest() : new ActiveXObject("Microsoft.XMLHTTP");
  datasend += '&isajax=1';    // to know in php it is ajax request

  request.open('POST', '/ssep/index.php');			// define the request

  // tell the PHP script that the data is sent via POST as URL-encoded form data
  request.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');

  // register the handler before sending; when the response is completely received, pass it to callback function
  request.onreadystatechange = function() {
    if (request.readyState == 4) {
      callback(request.responseText);
    }
  };
  request.send(datasend);
}

// keyup event on #search
if(document.getElementById('search')) {
  var src_frm = document.getElementById('search');    // form for search
  if(!document.getElementById('src_sugest')) {
    src_frm.insertAdjacentHTML('beforeend', '<div id="src_sugest"></div>');
  }
  var src_sugest = document.getElementById('src_sugest');    // element for search-suggest (existing, or the one just added)
  var cache_sugest = {};    // cache of returned suggestions, with the search string as key
  var sugest_src = [];    // store the keys of cache_sugest, oldest first

  // get string value, if 3+ characters, removes non alpha-numeric-line-space characters
  // call ajax with the string. Add response in Div #src_sugest
  function srcSugest(src) {
    src = src.replace(/([^A-z0-9\u00C0-\u00FF ])/ig, ' ').replace(/( [A-z0-9\u00C0-\u00FF]{1,2} )|(^[A-z0-9\u00C0-\u00FF]{1,2} )|( [A-z0-9\u00C0-\u00FF]{1,2}$)/ig, ' ').replace(/\s\s+/ig, ' ').replace(/^\s+|\s+$/g, '').toLowerCase();

    if(src.length > 2) {
      // if sugested in cache, add it, else, get via ajax
      if(cache_sugest[src]) src_sugest.innerHTML = cache_sugest[src] +'<div onclick="this.parentNode.innerHTML = \'\';">X</div>';
      else {
        ajaxSend('sugest='+ src, function(resp){
          if(resp.length > 8) {
            if(src_sugest) src_sugest.innerHTML = resp +'<div onclick="this.parentNode.innerHTML = \'\';">X</div>';

            // store sugested in $cache_sugest, keeping 15 caches (delete $src from $sugest_src, and $cache_sugest)
            if(sugest_src.length > 15) delete cache_sugest[sugest_src.shift()];
            cache_sugest[src] = resp;
            sugest_src.push(src);
          }
        });
      }
    }
    else if(src_sugest) src_sugest.innerHTML = '';
  }

  // call srcSugest() on each keyup in the search field
  function srKeyup(e){ srcSugest(e.target.value); }
  src_frm['sr'].addEventListener('keyup', srKeyup, false);

  // called onclick a sugested title. Get and set search phrase
  function getSugest(src_t) {
    src_sugest.innerHTML = '';
    src_frm['sr'].value = src_t.innerHTML.replace(/\<[^\>]*\>/ig, '');    // delete tags
    src_frm.submit();
  }
}
</script>
- The address in the "action" attribute of <form> must point to the "ssep/index.php" file.
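
The caching idea used in the script above (keep a limited number of responses and drop the oldest one first) can be isolated as a small first-in, first-out cache. The sketch below is a standalone rewrite for illustration, not the code shipped with SSEP; the createSuggestCache() name and the "limit" parameter are hypothetical.

```javascript
// Creates a cache that holds at most `limit` entries and evicts the oldest one.
function createSuggestCache(limit) {
  const store = new Map();   // a Map keeps insertion order, oldest key first
  return {
    get(key) { return store.get(key); },
    set(key, html) {
      if (!store.has(key) && store.size >= limit) {
        // drop the oldest entry to make room for the new one
        store.delete(store.keys().next().value);
      }
      store.set(key, html);
    },
    size() { return store.size; },
  };
}

const cache = createSuggestCache(2);
cache.set('php', '<h4>PHP Tutorial</h4>');
cache.set('ajax', '<h4>Ajax Lessons</h4>');
cache.set('css', '<h4>CSS Course</h4>');   // the cache is full, so 'php' is evicted
console.log(cache.get('php'), cache.size());
```

A typed string that is already in the cache can then be shown at once, without another Ajax request.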

• Also add this CSS code to an external .css file or to a <style> tag included in the web page (it styles the suggestions).
#search {
 position:relative;
 padding:0;
}
#search #ssep_inp:focus {
 background:#eeeefe;
}
#src_sugest {
 position:absolute;
 left:-.5em;
 margin:1px 0 0 2px;
 max-width:15em;
 max-height:30em;
 background:#eee;
 text-align:left;
 padding:0;
 font-size:.9em;
 font-family:"Calibri",sans-serif;
 z-index:9990;
 overflow-Y:auto;
 overflow-X:hidden;
 -moz-border-radius:.5em;
 -webkit-border-radius:.5em;
 -khtml-border-radius:.5em;
 border-radius:.5em;
}
#src_sugest h4 {
 margin:3px 2px;
 border-bottom:1px dashed #0001bb;
 padding:1px;
 font-weight:600;
 cursor:pointer;
}
#src_sugest h4:hover {
 background:#fefefe;
}
#src_sugest .hglw {
 background:#fbfbbb;
 font-weight:700;
 font-style:oblique;
}
#src_sugest div {
 position:absolute;
 top:0;  right:0;
 border:1px solid #fe0000;
 background:#fbfbfb;
 padding:1px 3px;
 font-size:1em;
 font-weight:700;
 color:#fb0001;
 cursor:pointer;
 -moz-border-radius:.4em;
 -webkit-border-radius:.4em;
 -khtml-border-radius:.4em;
 border-radius:.4em;
}
#src_sugest div:hover {
 background:#fbfb00;
}
- See the example in the "example-search-form-sugest.htm" file.

Other Specifications

• By default, the SSEP script uses PDO to connect to the MySQL database. If your server does not support PDO, the script will use MySQLi.

• The tables in the MySQL database are created automatically with the createMainTables() and createIndexTables() methods, in the "php/crawlindex.php" file.
- If the script does not create the tables automatically, you can create the main tables by accessing this address in your browser (after you log in as Admin):
http://your_domain/ssep/admin.php?mod=create_tables

• These types of files (extensions) are excluded from crawling:

3g2|3gp|7z|a52|aac|ace|amv|ar|arc|arj|as|asc|asf|avi|bin|bmp|bz2|bzip|bzip2|css|csv|divx|dll|drc|dv|f4v|exe|fla|flv|gif|gvi|gxf|gz|gzip|ice|ico|inf|ini|iso|jar|jpg|jpe|jpeg|js|jsfl|json|log|kar|m1v|m2v|m4a|m4v|midi|mkv|mp1|mp2|mp3|mp4|mtv|mxf|odb|odf|odp|ogg|ogm|ogv|ogx|ott|pcd|pdf|pic|pgm|png|pps|ppt|psd|ram|rar|rle|rm|sgv|sql|swf|tar|tga|tif|tiff|ttf|vlc|wmf|wmv|wvx|xlt|xar|xfl|zip|zipx

- If you want to modify this list, edit the $exclude_files property, in the "php/crawlindex.php" file (line 22). Use the "|" as separator.
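
To show how a "|" separated list like this one can be used, the sketch below turns a shortened copy of it into a regular expression and tests some URLs. It is an illustration in JavaScript, not the PHP code from "php/crawlindex.php"; the isExcluded() function is hypothetical, and ignoring the query string is an assumption.

```javascript
// Shortened list for the example; the full list is in the $exclude_files property.
const excludeFiles = 'css|gif|jpg|jpeg|js|pdf|png|zip';
const excludeRe = new RegExp('\\.(' + excludeFiles + ')$', 'i');

// Returns true when the URL path ends with one of the excluded extensions.
function isExcluded(url) {
  // look only at the path, so a query string or fragment does not hide the extension
  const path = url.split(/[?#]/)[0];
  return excludeRe.test(path);
}

console.log(isExcluded('http://example.com/img/logo.PNG'));   // true
console.log(isExcluded('http://example.com/page.html'));      // false
```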

• The "stop_words.txt" file (in the "ssep/php/" folder) contains stop words which will be excluded from searches. You can add other stop words too, separated by commas.
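
The effect of a comma-separated stop-word list on a search phrase can be sketched as below. The sample words and the removeStopWords() function are made up for this example; the real filtering is done by the PHP script with the content of "stop_words.txt".

```javascript
// Content as it might appear in stop_words.txt (sample words only).
const fileContent = 'a, an, and, the, of, to';
const stopWords = new Set(
  fileContent.split(',').map(w => w.trim().toLowerCase()).filter(Boolean)
);

// Removes the stop words from a search phrase, keeping the other words in order.
function removeStopWords(query) {
  return query
    .split(/\s+/)
    .filter(w => w && !stopWords.has(w.toLowerCase()))
    .join(' ');
}

console.log(removeStopWords('The history of PHP and Ajax'));   // history PHP Ajax
```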

• You can register multiple domains to Crawl and Index, BUT the Search Page searches in a single domain.
- The SSEP script crawls and indexes only the local links, that is, links that point to pages of the domain currently selected in the Admin Panel.

• By default, the script removes the <a>, <form>, and <select> HTML elements (with all their content) from the indexed content stored in the database, for faster and better search results (however, the local links in the crawled page are still followed). You can delete them from the excluded list (by clicking on the [X] button), or add other HTML elements to be removed, with specified attributes, in the "Advanced Settings" section of the Admin Panel.
For example, if you want to exclude all the <ul> items with class "menu": add UL to the "Tag Name" column, class to the "Attribute" column, and menu to the "Value" column.
- You can specify multiple values for the same attribute, separated by commas.
For example, to remove the <div> items which have id="some_name", id="side", and id="footer": add DIV to the "Tag Name" column, id to "Attribute", and: some_name, side, footer to "Value".
To remove all the <pre> elements, just add PRE to "Tag Name", and leave the "Attribute" and "Value" fields empty.
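
The three examples above can be read as rules with a tag name, an attribute and a list of values. The sketch below shows one way such rules could be matched against an element; the isRemoved() function and the simplified element objects are hypothetical, and matching the whole attribute value exactly is an assumption about how SSEP compares values.

```javascript
// Each rule mirrors one row in the table: Tag Name, Attribute, Value.
const rules = [
  { tag: 'UL', attr: 'class', value: 'menu' },
  { tag: 'DIV', attr: 'id', value: 'some_name, side, footer' },
  { tag: 'PRE', attr: '', value: '' },
];

// element: simplified stand-in for a parsed HTML element,
// for example { tag: 'div', attrs: { id: 'side' } }
function isRemoved(element, ruleList) {
  return ruleList.some(rule => {
    if (rule.tag.toLowerCase() !== element.tag.toLowerCase()) return false;
    if (!rule.attr) return true;   // no attribute given: every element with this tag matches
    const actual = element.attrs[rule.attr];
    if (actual === undefined) return false;
    const wanted = rule.value.split(',').map(v => v.trim());
    return wanted.includes(actual);
  });
}

console.log(isRemoved({ tag: 'div', attrs: { id: 'side' } }, rules));      // true
console.log(isRemoved({ tag: 'div', attrs: { id: 'content' } }, rules));   // false
```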

• Each registered domain can be configured with its own configuration settings.

• By default, the search results in the search page are loaded via Ajax, with Infinite Pagination. When the scrollbar gets near the bottom of the page, the next page of results is loaded, and so on, until the last page of results for that search.
You can replace Infinite Pagination with Classic pagination in the "Advanced Settings" section. The pagination links will be displayed at the bottom, after the search results.
- If you do not want to use Ajax, check the Disable button in "Advanced Settings".
- After you make changes in "Advanced Settings", delete the cache files (with the button: Delete Cache files).
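
The Infinite Pagination behaviour described above comes down to two decisions: is the visible area close to the bottom of the page, and is there another page of results to load. The sketch below shows both as plain functions; the function names and the 200-pixel threshold are hypothetical and not taken from the SSEP script.

```javascript
// Returns true when the visible area is within `threshold` pixels of the page bottom.
function isNearBottom(scrollTop, viewportHeight, pageHeight, threshold) {
  return scrollTop + viewportHeight >= pageHeight - threshold;
}

// Returns the number of the page to request next, or null when nothing should be loaded.
function nextPage(currentPage, totalPages, loading) {
  if (loading || currentPage >= totalPages) return null;
  return currentPage + 1;
}

// Example: a 3000px page, an 800px window, scrolled down to 2050px.
if (isNearBottom(2050, 800, 3000, 200)) {
  console.log('load page', nextPage(1, 4, false));   // load page 2
}
```

In a browser, the three measurements would typically come from window.scrollY, window.innerHeight and document.documentElement.scrollHeight, read inside a "scroll" event handler.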

For other details, see the description shown by the info mark associated with each option.

• To make changes in the HTML of the Search Page, edit the "search.htm" file, in the "ssep/templ/" folder.
• To change the style of the Search Page, edit the "search_style.css" and "search_style_mobile.css" files, in the "ssep/templ/" folder ("search_style_mobile.css" is for browsers with the width less than 400 pixels, for mobile devices).


• To see an online Demo of this SSEP application, visit: coursesweb.net/scripts/ssep/admin.php

• Home Page: coursesweb.net/php-mysql/ssep-site-search-engine-php-ajax_s2


- This script is Free, and Open Source. You can use, modify and publish it freely.
- Have a Happy Life with Everyone -