Posted by paris on Apr 20, 2017 in Fixes
To force a WorkSpace to be re-indexed, make a change to the WorkSpace description (add a full stop or other characters), save it, then reverse the change and save it again.
This will update the time stamp on the WorkSpace and the Indexer should pick it up to be indexed.
The WorkSite Connector service only looks at the EDITWHEN or EDITPROFILEWHEN column in the MHGROUP.DOCMASTER table.
If changing this does not get the WorkSpace into the indexer, your connector might be broken. Make sure you update to the latest version (connector.jar).
If you have a batch of workspaces you can also do this through SQL:
To check if you are running UTC go to the below registry path on your iManage Servers
UTC In Use = N
UTC In Use = Y
If this key does not exist, run this query in SQL:
SELECT 'GETDATE() ', GETDATE();
SELECT 'GETUTCDATE() ', GETUTCDATE();
Compare both times, then update a workspace name (add a full stop at the end and delete it again) and use a select query to see which of the two times the new timestamp is closer to. That tells you whether the server is using UTC or not.
SELECT *
FROM MHGROUP.DOCMASTER DWS
JOIN MHGROUP.PROJECTS PWS ON DWS.DOCNUM = PWS.DOCNUM
WHERE PWS.PRJ_NAME LIKE '%name of workspace%'
Check the PRJ_NAME matches your WorkSpace Description.
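The "which time is closer" comparison can be sketched in a few lines of Python. This is only an illustration: the function name `closer_clock` and the sample times are made up, and you would paste in the values returned by GETDATE(), GETUTCDATE() and the EDITWHEN / EDITPROFILEWHEN column yourself.

```python
from datetime import datetime

def closer_clock(edit_when: datetime, local_now: datetime, utc_now: datetime) -> str:
    """Say which clock reading is nearer to the timestamp written by the server."""
    local_gap = abs(local_now - edit_when)
    utc_gap = abs(utc_now - edit_when)
    if local_gap == utc_gap:
        # Happens when the SQL Server's local time zone is UTC itself.
        return "UNDETERMINED"
    return "UTC" if utc_gap < local_gap else "LOCAL"

# Hypothetical readings from a server running one hour ahead of UTC.
local_now = datetime(2017, 4, 20, 14, 0, 30)   # GETDATE()
utc_now = datetime(2017, 4, 20, 13, 0, 30)     # GETUTCDATE()
edit_when = datetime(2017, 4, 20, 13, 0, 5)    # timestamp on the edited workspace
print(closer_clock(edit_when, local_now, utc_now))
```

If the result is UNDETERMINED, the two functions return the same time on that server, so either update statement would write the same value.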
If UTC In Use = N (not using UTC), use the below.
To update the WorkSpace for re-indexing:
UPDATE MHGROUP.DOCMASTER
SET EDITPROFILEWHEN = GETDATE()
FROM MHGROUP.DOCMASTER DWS
JOIN MHGROUP.PROJECTS PWS ON DWS.DOCNUM = PWS.DOCNUM
WHERE PWS.PRJ_NAME LIKE '%name of workspace%'
If UTC In Use = Y (using UTC), use the below.
To update the WorkSpace for re-indexing:
UPDATE MHGROUP.DOCMASTER
SET EDITPROFILEWHEN = GETUTCDATE()
FROM MHGROUP.DOCMASTER DWS
JOIN MHGROUP.PROJECTS PWS ON DWS.DOCNUM = PWS.DOCNUM
WHERE PWS.PRJ_NAME LIKE '%name of workspace%'
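If you script this for a batch of workspaces, the only difference between the two statements is the date function. A minimal Python sketch of picking the right one is below. The helper names `build_reindex_sql` and `like_pattern` are hypothetical, and the `?` placeholder assumes a database driver that uses question-mark parameters. Nothing here connects to SQL; it only builds the text.

```python
def build_reindex_sql(utc_in_use: bool) -> str:
    """Build the re-index UPDATE, using the date function that matches the UTC setting."""
    clock = "GETUTCDATE()" if utc_in_use else "GETDATE()"
    return (
        "UPDATE MHGROUP.DOCMASTER\n"
        f"SET EDITPROFILEWHEN = {clock}\n"
        "FROM MHGROUP.DOCMASTER DWS\n"
        "JOIN MHGROUP.PROJECTS PWS ON DWS.DOCNUM = PWS.DOCNUM\n"
        "WHERE PWS.PRJ_NAME LIKE ?"  # workspace name is passed as a parameter
    )

def like_pattern(workspace_name: str) -> str:
    """Wrap a workspace name in wildcards, as in the manual queries."""
    return f"%{workspace_name}%"

# Print the statement for a server where UTC In Use = Y.
print(build_reindex_sql(utc_in_use=True))
print(like_pattern("name of workspace"))
```

Passing the name as a parameter rather than pasting it into the string avoids problems with workspace names that contain quotes.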
Posted by paris on Jan 20, 2015 in Fixes
After a Veeam restore of the Indexer Server, the Worksite Connector/Crawler started but its log file was not being updated, which meant it was in a stale state.
Stop the Worksite Connector Service
Go to the directory: Worksite Connector\actions\fetch
Delete fetch.queue file
Start the service back up
Check that the log file at Worksite Connector\logs\worksiteCrawler.log is updating
You might also want to check that the ingestion service log is updating; if it is not, you might need to clear the ingest.queue
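A quick way to confirm a log is updating is to look at its last-modified time. Here is a small Python sketch of that check. The function name `is_log_fresh` and the 5 minute threshold are my own assumptions, so adjust the threshold to suit your crawl interval.

```python
import os
import time
from typing import Optional

def is_log_fresh(path: str, max_age_seconds: float = 300.0,
                 now: Optional[float] = None) -> bool:
    """Return True if the file exists and was modified within max_age_seconds."""
    if not os.path.isfile(path):
        return False
    current = time.time() if now is None else now
    return (current - os.path.getmtime(path)) <= max_age_seconds

# Example: relative path from the steps above; prefix it with your install location.
log_path = r"Worksite Connector\logs\worksiteCrawler.log"
print("updating" if is_log_fresh(log_path) else "stale or missing")
```

The same function can be pointed at the ingestion service log.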
If the above does not work, the following might need to be done:
- Stop all services.
- Run the _cleanup scripts from E:\Indexer\WorkSite Connector\ & E:\Indexer\WorkSite Ingestion Server\
- Change the time in the file E:\Indexer\WorkSite Connector\connector_DBNAME_datastore.db back to “m/d/yy”, the date when the indexer stopped working. Please note this time is stored in “epoch” format, so you will need a timestamp converter.
- Start the services.
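If you do not have a timestamp converter to hand, Python can do the conversion. The sketch below assumes the value is epoch seconds in UTC. I have not confirmed whether the datastore file uses seconds or milliseconds, so compare against the value already in the file (a millisecond value has three more digits). The function names are just for illustration.

```python
from datetime import datetime, timezone

def to_epoch_seconds(year: int, month: int, day: int,
                     hour: int = 0, minute: int = 0, second: int = 0) -> int:
    """Convert a UTC date/time to epoch seconds."""
    dt = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(dt.timestamp())

def from_epoch_seconds(value: int) -> datetime:
    """Convert epoch seconds back to a UTC datetime, to read the existing value."""
    return datetime.fromtimestamp(value, tz=timezone.utc)

# Example date only; use the date your indexer stopped working.
stamp = to_epoch_seconds(2015, 1, 20)
print(stamp)                      # 1421712000
print(stamp * 1000)               # the same moment in milliseconds
print(from_epoch_seconds(stamp))  # 2015-01-20 00:00:00+00:00
```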
To stop and start the services, there are 2 methods:
- Run the scripts “IndexerInstallDrive\Indexer\_stop_services” and “IndexerInstallDrive\Indexer\_start_services”.
- Log into the Autonomy Control Center at “http://IndexerServerName:8080/controlCenter”. The default username and password are admin/admin. Once logged in you can stop and start all services in the GUI.
Essentially when indexing, data flows in the following order:
- The WorkSite Connector service, which crawls the WorkSite database servers.
- The WorkSite Ingestion service, which converts all files to the IDX format.
- The WorkSite Content service, which indexes all of the content and serves search requests.
When troubleshooting indexer issues, the logs are key to identifying the problem. I recommend ‘BareTail’, as it lets you view many logs at once and watch changes to them as they happen. Each component has at least 5 types of logs.
Connector: componentinstalldisk:\Indexer\WorkSite Connector\logs
Ingestion: componentinstalldisk:\Indexer\WorkSite Ingestion Server\logs
Content: componentinstalldisk:\Indexer\WorkSite Content\logs
Here is a nice write-up from Autonomy for when a client wants a better understanding of indexer timing:
1. Customer drags a document into WorkSite (~1-2 seconds, very minimal)
2. WorkSite Crawler crawls for new/updated documents at intervals (it takes a 1 minute rest between crawls)
3. Document is moved through Ingestion (~1-2 seconds, very minimal)
4. Document is moved through Active DIH (~1-2 seconds, very minimal)
5. Document is moved to Active Content, where every 15 seconds Active Content writes the data to disk making it available for search.
So the time for a new document to be indexed depends largely on where the Connector and Active Content are in their intervals. Potentially you could see a 1 minute 30 second delay, or potentially only a 30 second delay, before the document is searchable.
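To see where those figures come from, here is a rough calculation in Python using only the numbers in the list above. It is a sketch with a made-up function name, and it ignores how long a crawl itself takes, so treat the results as loose bounds rather than measurements.

```python
def indexing_delay_bounds(crawl_rest: float = 60.0,
                          flush_interval: float = 15.0,
                          step_min: float = 1.0,
                          step_max: float = 2.0,
                          quick_steps: int = 3) -> tuple:
    """Return (best, worst) delay in seconds from the listed per-step figures.

    quick_steps counts the three ~1-2 second steps: the drag into WorkSite,
    Ingestion, and Active DIH.
    """
    # Best case: the crawl and the disk write both happen straight away.
    best = quick_steps * step_min
    # Worst case: the document just misses a crawl and just misses a disk write.
    worst = quick_steps * step_max + crawl_rest + flush_interval
    return best, worst

best, worst = indexing_delay_bounds()
print(best, worst)  # 3.0 81.0
```

The worst case of about 81 seconds, plus the time the crawl itself takes, is in the same range as the 1 minute 30 second figure quoted above.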