FUSION Scheduler to Import using Delta Query

Brief Description - 

1. Create a datasource with a delta SQL query. Delta queries provide incremental updates to the contents of a collection by indexing only those database records that have changed since the database was last indexed by this connector.

2. Add a schedule with the required polling interval, and point it at the datasource created above.
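
To make the idea of a delta query concrete, here is a minimal sketch that simulates two indexing runs. It uses an in-memory SQLite database as a stand-in for MySQL; the column names (id, name, date) and the sample rows are assumptions made for illustration, not the actual schema of the "Persons" table.

```python
import sqlite3

# Hypothetical stand-in for the MySQL "Persons" table; the columns
# (id, name, date) and the rows are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (id INTEGER PRIMARY KEY, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO Persons (id, name, date) VALUES (?, ?, ?)",
    [
        (1, "Alice", "2016-01-01 10:00:00"),
        (2, "Bob", "2016-01-01 10:05:00"),
    ],
)


def fetch_delta(connection, last_index_time):
    """Return only the rows changed after the previous indexing run."""
    cursor = connection.execute(
        "SELECT id, name, date FROM Persons WHERE date > ? ORDER BY id",
        (last_index_time,),
    )
    return cursor.fetchall()


# First run: nothing has been indexed yet, so every row qualifies.
first_run = fetch_delta(conn, "1970-01-01 00:00:00")
last_index_time = "2016-01-01 10:06:00"  # time at which the first run finished

# A record changes after the first run.
conn.execute(
    "UPDATE Persons SET name = 'Robert', date = '2016-01-01 10:07:00' WHERE id = 2"
)

# Second run: only the changed row is returned.
second_run = fetch_delta(conn, last_index_time)
print(first_run)
print(second_run)
```

The second run returns a single row because only one record has a date later than the previous run. This is the behaviour the scheduled datasource relies on at each polling interval.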


Detailed steps -
Here, I'm going to create a JDBC datasource that imports records from the table "Persons", and then add a schedule that polls this datasource every minute to perform the incremental fetch.


Steps - 

Database related

1. Start the MySQL service.
2. The existing database in MySQL is "FirstMySqlDatabase", and the table in it is "Persons". Refer to image_1.png.

Fusion Datasource creation

1. Create a collection named "Persons" in Fusion.
2. Click on the collection, navigate to DataSource, and select the JDBC datasource.
3. Upload the MySQL JDBC driver JAR. Enter the other configuration details, such as the datasource ID, URL, driver, user name, password, and the delta SQL query that should run each time the job is triggered. See the attached image_2 for reference.

The details I filled are as follows -
i) Datasource ID - jdbc_test_ds_id
ii) Pipeline ID - Default_Data
iii) URL - jdbc:mysql://localhost/FirstMySqlDatabase
iv) Driver - com.mysql.jdbc.Driver
v) Username - root
vi) Password - ****
vii) SQL Select Statement - select * from Persons where date > $
viii) Save
ix) Run the job and check whether it fetched the records.

If you have questions about what "$" means in the SQL statement, see the JDBC connector documentation [3].
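
As a rough illustration of the placeholder, the sketch below substitutes a timestamp for "$" in the statement from step vii. It assumes that "$" stands for the time of the last indexing run, as the delta query description suggests. The helper name render_delta_query and the timestamp format are hypothetical; the exact substitution Fusion performs is described in [3], not here.

```python
from datetime import datetime

# The statement entered in step vii of the datasource configuration.
DELTA_SQL = "select * from Persons where date > $"


def render_delta_query(template, last_index_time):
    """Replace the single '$' marker with a quoted timestamp literal.

    Hypothetical helper: the literal format is an assumption for this sketch.
    """
    if template.count("$") != 1:
        raise ValueError("expected exactly one '$' placeholder")
    literal = "'" + last_index_time.strftime("%Y-%m-%d %H:%M:%S") + "'"
    return template.replace("$", literal)


rendered = render_delta_query(DELTA_SQL, datetime(2016, 1, 1, 10, 6, 0))
print(rendered)
```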


Fusion Scheduler Configuration

1. On the Fusion admin page, go to "Applications" and click "Scheduler".
This navigates to the URL http://<host>:8764/scheduler

2. Click "Add a Schedule" and enter the configuration details, such as the scheduler name, the service name, the interval, and the time when the job should start running. See the attached image_3 for reference.

The details I filled are as follows - 
i) Scheduler Name - "DeltaFetchJob"
ii) Service - select "service" from the Service drop-down. The service endpoint in this case is "connectors/jobs/jdbc_test_ds_id" (the connectors job for the datasource ID "jdbc_test_ds_id" that we created above).
iii) Method - "POST"
iv) Start time - the value picked by default [From now]
v) Uncheck "Run Once" (because we want it to run every minute).
vi) Set the interval to 1 and select "MINUTE".
vii) Check "Active" (if it is unchecked).
viii) Save (image_4 for reference).
ix) After saving, search with the filter on the same page. You should see this scheduler listed, showing its next scheduled time (image_5 for reference).
x) After a minute, verify that it fetched the updated record.
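
The form values above can also be summarised as a single schedule definition. The sketch below only builds and prints such a definition; it does not call Fusion. The field names (id, active, interval, repeatUnit, callParams) and the "service://" URI prefix are assumptions modelled on the form labels, so check them against the Schedules documentation [2] before using them with any API.

```python
import json


def build_schedule(schedule_id, datasource_id, interval=1, repeat_unit="MINUTE"):
    """Build a schedule definition mirroring the form fields above.

    All keys are assumptions for illustration; verify them against [2].
    """
    return {
        "id": schedule_id,
        "active": True,             # step vii: "Active" is checked
        "interval": interval,       # step vi: interval of 1
        "repeatUnit": repeat_unit,  # step vi: "MINUTE"
        "callParams": {
            # step ii: the connectors job for the datasource
            "uri": "service://connectors/jobs/" + datasource_id,
            "method": "POST",       # step iii
        },
    }


schedule = build_schedule("DeltaFetchJob", "jdbc_test_ds_id")
print(json.dumps(schedule, indent=2))
```

Keeping the definition in one place makes it easier to compare against what the scheduler page shows after saving.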

 

[1] https://doc.lucidworks.com/fusion/2.1/Connectors-and-Datasources.html
[2] https://doc.lucidworks.com/fusion/2.1/System_Administration/Schedules.html
[3] https://doc.lucidworks.com/fusion/2.1/Connectors_and_Datasources_Reference/JDBC-Connector-and-Datasource-Configuration.html

 
