Feed aggregator

Case Study

Jonathan Lewis - Thu, 2022-09-29 12:27

A recent question on the Oracle Developer Community forum asked for help with a statement that was taking a long time to run. The thread included the results from a trace file that had been passed through tkprof so we have the query and the actual execution plan with some rowsource execution stats.

Here’s the query – extracted from the tkprof output:

SELECT DISTINCT
       pll.po_line_id,
       ploc.line_location_id,
       (SELECT ptl.line_type
          FROM apps.po_line_types_tl ptl
         WHERE ptl.line_type_id = pll.line_type_id AND ptl.LANGUAGE = 'US')
           "Line_Type",
       ploc.quantity_accepted,
       NULL
           release_approved_date,
       NULL
           release_date,
       NULL
           release_hold_flag,
       NULL
           release_type,
       DECODE (ploc.po_release_id, NULL, NULL, ploc.quantity)
           released_quantity,
       (SELECT items.preprocessing_lead_time
          FROM apps.mtl_system_items_b items
         WHERE     items.inventory_item_id = pll.item_id
               AND items.organization_id = ploc.SHIP_TO_ORGANIZATION_ID)
           "PreProcessing_LT",
       (SELECT items.full_lead_time
          FROM apps.mtl_system_items_b items
         WHERE     items.inventory_item_id = pll.item_id
               AND items.organization_id = ploc.SHIP_TO_ORGANIZATION_ID)
           "Processing_LT",
       (SELECT items.postprocessing_lead_time
          FROM apps.mtl_system_items_b items
         WHERE     items.inventory_item_id = pll.item_id
               AND items.organization_id = ploc.SHIP_TO_ORGANIZATION_ID)
           "PostProcessing_LT",
       ploc.firm_status_lookup_code,
       NVL (
           (SELECT pla.promised_date
              FROM apps.po_line_locations_archive_all pla
             WHERE     pla.po_header_id = pha.po_header_id
                   AND pla.po_line_id = pll.po_line_id
                   AND pla.line_location_id = ploc.line_location_id
                   AND pla.revision_num =
                       (SELECT MIN (revision_num)
                          FROM apps.po_line_locations_archive_all plla2
                         WHERE     plla2.promised_date IS NOT NULL
                               AND plla2.line_location_id =
                                   ploc.line_location_id)),
           ploc.promised_date)
           "Original_Promise_Date",
       (SELECT items.long_description
          FROM apps.mtl_system_items_tl items
         WHERE     items.inventory_item_id = pll.item_id
               AND items.organization_id IN
                       (SELECT fin.inventory_organization_id
                          FROM apps.financials_system_params_all fin
                         WHERE fin.org_id = pha.org_id)
               AND items.LANGUAGE = 'US')
           "Item_Long_Description",
       NVL (ploc.approved_flag, 'N')
           approved_code,
       pvs.country
           "Supplier_Site_Country",
       pll.note_to_vendor,
         NVL (ploc.quantity, 0)
       - NVL (ploc.quantity_cancelled, 0)
       - NVL (ploc.quantity_received, 0) * ploc.price_override
           "Shipment_Amount",
       ploc.attribute4
           "PO_Ship_Date",
       (SELECT meaning
          FROM apps.fnd_lookup_values
         WHERE     lookup_type = 'SHIP_METHOD'
               AND lookup_code = ploc.attribute9
               AND language = 'US')
           "Ship_Method",
       (SELECT prla.note_to_receiver
          FROM apps.po_req_distributions_all  prda
               INNER JOIN apps.po_requisition_lines_all prla
                   ON prda.requisition_line_id = prla.requisition_line_id
         WHERE prda.distribution_id = pdi.req_distribution_id)
           "Note_To_Receiver",
       DECODE (pha.USER_HOLD_FLAG, 'Y', 'Y', pll.USER_HOLD_FLAG)
           "Hold_Flag",
       (SELECT ABC_CLASS_NAME
          FROM APPS.MTL_ABC_ASSIGNMENT_GROUPS  ASG
               INNER JOIN APPS.MTL_ABC_ASSIGNMENTS ASSI
                   ON ASG.ASSIGNMENT_GROUP_ID = ASSI.ASSIGNMENT_GROUP_ID
               INNER JOIN APPS.MTL_ABC_CLASSES classes
                   ON ASSI.ABC_CLASS_ID = classes.ABC_CLASS_ID
         WHERE     ASG.organization_id = ploc.SHIP_TO_ORGANIZATION_ID
               AND ASG.ASSIGNMENT_GROUP_NAME = 'MIN ABC Assignment'
               AND ASSI.inventory_item_id = pll.item_id)
           ABCClass,
       (SELECT CONCATENATED_SEGMENTS AS charge_accountsfrom
          FROM apps.gl_code_combinations_kfv gcc
         WHERE gcc.code_combination_id = pdi.code_combination_id)
           AS charge_accounts
  FROM apps.po_headers_all         pha,
       apps.po_lines_all           pll,
       apps.po_line_locations_all  ploc,
       apps.po_distributions_all   pdi,
       apps.per_all_people_f       papf,
       apps.AP_SUPPLIERS           pv,
       apps.AP_SUPPLIER_SITES_ALL  pvs,
       apps.AP_SUPPLIER_CONTACTS   pvc,
       apps.ap_terms               apt,
       apps.po_lookup_codes        plc1,
       apps.po_lookup_codes        plc2,
       apps.hr_locations           hlv_line_ship_to,
       apps.hr_locations           hlv_ship_to,
       apps.hr_locations           hlv_bill_to,
       apps.hr_organization_units  hou,
       apps.hr_locations_no_join   loc,
       apps.hr_locations_all_tl    hrl1,
       apps.hr_locations_all_tl    hrl2
 WHERE     1 = 1
       AND pll.po_header_id(+) = pha.po_header_id
       AND ploc.po_line_id(+) = pll.po_line_id
       AND pdi.line_location_id(+) = ploc.line_location_id
       AND ploc.shipment_type IN ('STANDARD', 'PLANNED')
       AND papf.person_id(+) = pha.agent_id
       AND TRUNC (SYSDATE) BETWEEN papf.effective_start_date
                               AND papf.effective_end_date
       AND papf.employee_number IS NOT NULL
       AND pv.vendor_id(+) = pha.vendor_id
       AND pvs.vendor_site_id(+) = pha.vendor_site_id
       AND pvc.vendor_contact_id(+) = pha.vendor_contact_id
       AND apt.term_id(+) = pha.terms_id
       AND plc1.lookup_code(+) = pha.fob_lookup_code
       AND plc1.lookup_type(+) = 'FOB'
       AND plc2.lookup_code(+) = pha.freight_terms_lookup_code
       AND plc2.lookup_type(+) = 'FREIGHT TERMS'
       AND hlv_line_ship_to.location_id(+) = ploc.ship_to_location_id
       AND hlv_ship_to.location_id(+) = pha.ship_to_location_id
       AND hlv_bill_to.location_id(+) = pha.bill_to_location_id
       AND hou.organization_id = pha.org_id
       AND hou.location_id = loc.location_id(+)
       AND hrl1.location_id(+) = pha.ship_to_location_id
       AND hrl1.LANGUAGE(+) = 'US'
       AND hrl2.location_id(+) = pha.bill_to_location_id
       AND hrl2.LANGUAGE(+) = 'US'
       AND hou.organization_id IN (2763)
       AND NVL (pha.closed_code, 'OPEN') IN ('OPEN', 'CLOSED')
       AND NVL (pll.closed_code, 'OPEN') IN ('OPEN', 'CLOSED')
       AND NVL (ploc.cancel_flag, 'N') = 'N'
       AND pha.authorization_status IN
               ('APPROVED', 'REQUIRES REAPPROVAL', 'IN PROCESS')

As you can see there are 10 inline scalar subqueries in the query, with a select distinct to finish off the processing of an 18-table join. That's a lot of scalar subqueries, so it's worth asking whether the code should be rewritten to use joins (though in newer versions of Oracle some of the subqueries might be transformed to outer joins anyway). We also know that a distinct may be a hint that there's a possible logic error that has been "fixed" by eliminating duplicates.

Ignoring those points, let’s consider the execution plan from the tkprof output which (with a tiny bit of extra formatting) is as follows:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.46       1.75          0          3          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    50346    279.02    1059.39     179103   30146895          0      755164
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    50348    279.49    1061.14     179103   30146898          0      755164

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 678  
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         9          9          9  TABLE ACCESS BY INDEX ROWID PO_LINE_TYPES_TL (cr=20 pr=0 pw=0 time=680 us cost=2 size=32 card=1)
         9          9          9   INDEX UNIQUE SCAN PO_LINE_TYPES_TL_U1 (cr=11 pr=0 pw=0 time=323 us cost=1 size=0 card=1)(object id 63682480)

    576365     576365     576365  TABLE ACCESS BY INDEX ROWID MTL_SYSTEM_ITEMS_B (cr=2267756 pr=28 pw=0 time=22598079 us cost=4 size=13 card=1)
    576365     576365     576365   INDEX UNIQUE SCAN MTL_SYSTEM_ITEMS_B_U1 (cr=1720936 pr=0 pw=0 time=4644552 us cost=3 size=0 card=1)(object id 42812859)

    576365     576365     576365  TABLE ACCESS BY INDEX ROWID MTL_SYSTEM_ITEMS_B (cr=2267747 pr=0 pw=0 time=2442479 us cost=4 size=13 card=1)
    576365     576365     576365   INDEX UNIQUE SCAN MTL_SYSTEM_ITEMS_B_U1 (cr=1720936 pr=0 pw=0 time=1238342 us cost=3 size=0 card=1)(object id 42812859)

    576365     576365     576365  TABLE ACCESS BY INDEX ROWID MTL_SYSTEM_ITEMS_B (cr=2267743 pr=0 pw=0 time=2029190 us cost=4 size=14 card=1)
    576365     576365     576365   INDEX UNIQUE SCAN MTL_SYSTEM_ITEMS_B_U1 (cr=1720932 pr=0 pw=0 time=967729 us cost=3 size=0 card=1)(object id 42812859)

    672743     672743     672743  TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ARCHIVE_ALL (cr=5507736 pr=163043 pw=0 time=535914552 us cost=3 size=27 card=1)
    672743     672743     672743   INDEX UNIQUE SCAN PO_LINE_LOCATIONS_ARCHIVE_U1 (cr=4560824 pr=163043 pw=0 time=533161038 us cost=2 size=0 card=1)(object id 42811947)
    755121     755121     755121    SORT AGGREGATE (cr=3540960 pr=163043 pw=0 time=530079821 us)
   1040963    1040963    1040963     TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ARCHIVE_ALL (cr=3540960 pr=163043 pw=0 time=534243973 us cost=5 size=15 card=1)
   1776649    1776649    1776649      INDEX RANGE SCAN PO_LINE_LOCATIONS_ARCHIVE_U1 (cr=1123074 pr=6392 pw=0 time=37128373 us cost=3 size=0 card=2)(object id 42811947)

    587486     587486     587486  TABLE ACCESS BY INDEX ROWID MTL_SYSTEM_ITEMS_TL (cr=3436629 pr=3564 pw=0 time=64125044 us cost=5 size=34 card=1)
    587486     587486     587486   INDEX RANGE SCAN MTL_SYSTEM_ITEMS_TL_U1 (cr=2852930 pr=869 pw=0 time=45628505 us cost=4 size=0 card=1)(object id 136492495)
         1          1          1    TABLE ACCESS BY INDEX ROWID FINANCIALS_SYSTEM_PARAMS_ALL (cr=645351 pr=0 pw=0 time=5743158 us cost=2 size=10 card=1)
    322268     322268     322268     INDEX SKIP SCAN FINANCIALS_SYSTEM_PARAMS_U1 (cr=323083 pr=0 pw=0 time=5104895 us cost=1 size=0 card=1)(object id 42770563)

        10         10         10  TABLE ACCESS BY INDEX ROWID FND_LOOKUP_VALUES (cr=51 pr=1 pw=0 time=3620 us cost=5 size=60 card=1)
        20         20         20   INDEX RANGE SCAN FND_LOOKUP_VALUES_X99 (cr=31 pr=1 pw=0 time=2133 us cost=4 size=0 card=1)(object id 42759866)

    634276     634276     634276  NESTED LOOPS  (cr=3540930 pr=5535 pw=0 time=181518759 us cost=5 size=28 card=1)
    634276     634276     634276   TABLE ACCESS BY INDEX ROWID PO_REQ_DISTRIBUTIONS_ALL (cr=1631471 pr=5253 pw=0 time=65405333 us cost=3 size=12 card=1)
    634276     634276     634276    INDEX UNIQUE SCAN PO_REQ_DISTRIBUTIONS_U1 (cr=994522 pr=5252 pw=0 time=31023194 us cost=2 size=0 card=1)(object id 42788583)
    634276     634276     634276   TABLE ACCESS BY INDEX ROWID PO_REQUISITION_LINES_ALL (cr=1909459 pr=282 pw=0 time=115275921 us cost=2 size=16 card=1)
    634276     634276     634276    INDEX UNIQUE SCAN PO_REQUISITION_LINES_U1 (cr=944449 pr=268 pw=0 time=12285440 us cost=1 size=0 card=1)(object id 42789681)

    511989     511989     511989  NESTED LOOPS  (cr=3533763 pr=6 pw=0 time=8999321 us cost=5 size=55 card=1)
    511989     511989     511989   NESTED LOOPS  (cr=2850293 pr=6 pw=0 time=7086027 us cost=4 size=45 card=1)
    576055     576055     576055    TABLE ACCESS BY INDEX ROWID MTL_ABC_ASSIGNMENT_GROUPS (cr=612378 pr=0 pw=0 time=2002832 us cost=2 size=29 card=1)
    576055     576055     576055     INDEX UNIQUE SCAN MTL_ABC_ASSIGNMENT_GROUPS_U2 (cr=36323 pr=0 pw=0 time=951307 us cost=1 size=0 card=1)(object id 42783622)
    511989     511989     511989    TABLE ACCESS BY INDEX ROWID MTL_ABC_ASSIGNMENTS (cr=2237915 pr=6 pw=0 time=4672006 us cost=3 size=16 card=1)
    511989     511989     511989     INDEX UNIQUE SCAN MTL_ABC_ASSIGNMENTS_U1 (cr=1551490 pr=4 pw=0 time=2533524 us cost=2 size=0 card=1)(object id 42757737)
    511989     511989     511989   TABLE ACCESS BY INDEX ROWID MTL_ABC_CLASSES (cr=683470 pr=0 pw=0 time=1488045 us cost=1 size=10 card=1)
    511989     511989     511989    INDEX UNIQUE SCAN MTL_ABC_CLASSES_U1 (cr=171481 pr=0 pw=0 time=693745 us cost=0 size=0 card=1)(object id 42789694)

     13320      13320      13320  TABLE ACCESS BY INDEX ROWID GL_CODE_COMBINATIONS (cr=34801 pr=0 pw=0 time=802675 us cost=3 size=49 card=1)
     13320      13320      13320   INDEX UNIQUE SCAN GL_CODE_COMBINATIONS_U1 (cr=21481 pr=0 pw=0 time=397344 us cost=2 size=0 card=1)(object id 42775044)


    755164     755164     755164  HASH UNIQUE (cr=30147018 pr=179103 pw=0 time=1058922684 us cost=749257 size=197349453 card=482517)
    768890     768890     768890   HASH JOIN  (cr=7289842 pr=6926 pw=0 time=244582512 us cost=696202 size=197349453 card=482517)
    140451     140451     140451    TABLE ACCESS FULL PER_ALL_PEOPLE_F (cr=38207 pr=0 pw=0 time=313692 us cost=18484 size=13278261 card=428331)
    768890     768890     768890    NESTED LOOPS OUTER (cr=7251635 pr=6926 pw=0 time=242897348 us cost=672652 size=30016980 card=79410)
    755121     755121     755121     NESTED LOOPS OUTER (cr=5538283 pr=6031 pw=0 time=154841427 us cost=443987 size=28382903 card=78623)
    755121     755121     755121      NESTED LOOPS OUTER (cr=5508916 pr=6031 pw=0 time=153523676 us cost=443982 size=18184959 card=51809)
    755121     755121     755121       NESTED LOOPS OUTER (cr=5386279 pr=6031 pw=0 time=151985656 us cost=443978 size=11642422 card=34142)
    755121     755121     755121        NESTED LOOPS  (cr=5090949 pr=6031 pw=0 time=139220421 us cost=375644 size=11574138 card=34142)
    792959     792959     792959         NESTED LOOPS  (cr=1747964 pr=134 pw=0 time=64597738 us cost=109035 size=19934760 card=73560)
    254919     254919     254919          HASH JOIN OUTER (cr=315780 pr=6 pw=0 time=14811187 us cost=29121 size=5413350 card=22650)
    254919     254919     254919           NESTED LOOPS OUTER (cr=286919 pr=0 pw=0 time=12395919 us cost=13792 size=5209500 card=22650)
    254919     254919     254919            HASH JOIN RIGHT OUTER (cr=107134 pr=0 pw=0 time=12153146 us cost=13790 size=3868572 card=17426)
      3834       3834       3834             VIEW  HR_LOCATIONS (cr=3913 pr=0 pw=0 time=15826 us cost=125 size=360 card=60)
      3834       3834       3834              NESTED LOOPS  (cr=3913 pr=0 pw=0 time=15055 us cost=125 size=1080 card=60)
      3834       3834       3834               TABLE ACCESS FULL HR_LOCATIONS_ALL (cr=262 pr=0 pw=0 time=11211 us cost=125 size=304 card=38)
      3834       3834       3834               INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=3651 pr=0 pw=0 time=6183 us cost=0 size=20 card=2)(object id 42783719)
    254919     254919     254919             HASH JOIN RIGHT OUTER (cr=103221 pr=0 pw=0 time=11917174 us cost=13666 size=3764016 card=17426)
      3834       3834       3834              VIEW  HR_LOCATIONS (cr=3898 pr=0 pw=0 time=14651 us cost=125 size=360 card=60)
      3834       3834       3834               NESTED LOOPS  (cr=3898 pr=0 pw=0 time=14267 us cost=125 size=1080 card=60)
      3834       3834       3834                TABLE ACCESS FULL HR_LOCATIONS_ALL (cr=247 pr=0 pw=0 time=9532 us cost=125 size=304 card=38)
      3834       3834       3834                INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=3651 pr=0 pw=0 time=9539 us cost=0 size=20 card=2)(object id 42783719)
    254919     254919     254919              HASH JOIN RIGHT OUTER (cr=99323 pr=0 pw=0 time=11817243 us cost=13541 size=3659460 card=17426)
        45         45         45               INDEX RANGE SCAN FND_LOOKUP_VALUES_U1 (cr=21 pr=0 pw=0 time=614 us cost=4 size=49 card=1)(object id 63685210)
    254919     254919     254919               HASH JOIN RIGHT OUTER (cr=99302 pr=0 pw=0 time=11729251 us cost=13537 size=2805586 card=17426)
        59         59         59                INDEX RANGE SCAN FND_LOOKUP_VALUES_U1 (cr=20 pr=0 pw=0 time=445 us cost=4 size=49 card=1)(object id 63685210)
    254919     254919     254919                NESTED LOOPS  (cr=99282 pr=0 pw=0 time=11653162 us cost=13533 size=1951712 card=17426)
         1          1          1                 NESTED LOOPS OUTER (cr=116 pr=0 pw=0 time=113273 us cost=3 size=40 card=1)
         1          1          1                  NESTED LOOPS  (cr=113 pr=0 pw=0 time=113227 us cost=2 size=32 card=1)
         1          1          1                   INDEX UNIQUE SCAN HR_ALL_ORGANIZATION_UNTS_TL_PK (cr=110 pr=0 pw=0 time=113164 us cost=1 size=17 card=1)(object id 63680720)
         1          1          1                   TABLE ACCESS BY INDEX ROWID HR_ALL_ORGANIZATION_UNITS (cr=3 pr=0 pw=0 time=59 us cost=1 size=15 card=1)
         1          1          1                    INDEX UNIQUE SCAN HR_ORGANIZATION_UNITS_PK (cr=2 pr=0 pw=0 time=7 us cost=0 size=0 card=1)(object id 42789144)
         1          1          1                  TABLE ACCESS BY INDEX ROWID HR_LOCATIONS_ALL (cr=3 pr=0 pw=0 time=42 us cost=1 size=8 card=1)
         1          1          1                   INDEX UNIQUE SCAN HR_LOCATIONS_PK (cr=2 pr=0 pw=0 time=7 us cost=0 size=0 card=1)(object id 42797079)
    254919     254919     254919                 TABLE ACCESS BY INDEX ROWID PO_HEADERS_ALL (cr=99166 pr=0 pw=0 time=11505632 us cost=13530 size=1254672 card=17426)
    255397     255397     255397                  INDEX SKIP SCAN PO_HEADERS_ALL_X3 (cr=1753 pr=0 pw=0 time=725236 us cost=352 size=0 card=37674)(object id 42773719)
    254883     254883     254883            INDEX UNIQUE SCAN AP_TERMS_TL_U1 (cr=179785 pr=0 pw=0 time=183291 us cost=0 size=8 card=1)(object id 42798416)
    482528     482528     482528           TABLE ACCESS FULL AP_SUPPLIER_SITES_ALL (cr=28861 pr=6 pw=0 time=227983 us cost=13727 size=4323123 card=480347)
    792959     792959     792959          TABLE ACCESS BY INDEX ROWID PO_LINES_ALL (cr=1432184 pr=128 pw=0 time=53002963 us cost=5 size=96 card=3)
    793375     793375     793375           INDEX RANGE SCAN PO_LINES_U2 (cr=504726 pr=20 pw=0 time=17603112 us cost=2 size=0 card=5)(object id 42755253)
    755121     755121     755121         TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ALL (cr=3342985 pr=5897 pw=0 time=71357938 us cost=4 size=68 card=1)
   1138558    1138558    1138558          INDEX RANGE SCAN PO_LINE_LOCATIONS_N15 (cr=1707311 pr=5830 pw=0 time=37903421 us cost=3 size=0 card=2)(object id 63697005)
    723002     723002     723002        VIEW PUSHED PREDICATE  HR_LOCATIONS (cr=295330 pr=0 pw=0 time=11391536 us cost=2 size=2 card=1)
    723002     723002     723002         NESTED LOOPS  (cr=295330 pr=0 pw=0 time=11004720 us cost=2 size=18 card=1)
    723002     723002     723002          INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=146911 pr=0 pw=0 time=1391389 us cost=1 size=10 card=1)(object id 42783719)
    723002     723002     723002          TABLE ACCESS BY INDEX ROWID HR_LOCATIONS_ALL (cr=148419 pr=0 pw=0 time=9233363 us cost=1 size=8 card=1)
    723002     723002     723002           INDEX UNIQUE SCAN HR_LOCATIONS_PK (cr=117800 pr=0 pw=0 time=836734 us cost=0 size=0 card=1)(object id 42797079)
    755119     755119     755119       INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=122637 pr=0 pw=0 time=829404 us cost=0 size=20 card=2)(object id 42783719)
    755121     755121     755121      INDEX UNIQUE SCAN HR_LOCATIONS_ALL_TL_PK (cr=29367 pr=0 pw=0 time=716408 us cost=0 size=20 card=2)(object id 42783719)
    768883     768883     768883     TABLE ACCESS BY INDEX ROWID PO_DISTRIBUTIONS_ALL (cr=1713352 pr=895 pw=0 time=75314769 us cost=3 size=17 card=1)
    768883     768883     768883      INDEX RANGE SCAN PO_DISTRIBUTIONS_N1 (cr=1096671 pr=874 pw=0 time=24392643 us cost=2 size=0 card=1)(object id 42782429)

The plan is a bit long, but you may recall that a query with scalar subqueries in the select list reports the plans for each of the separate scalar subqueries before reporting the main query block – and I’ve inserted blank lines in the output above to improve the visibility of the individual blocks / scalar subqueries.

An odd little detail of this tkprof output was that there was no report of the wait information recorded against the query, though the following information appeared as the summary for the trace file, giving us a very good idea of the wait events for the individual query:

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        6      0.85       2.14          0          6          0           0
Execute      6      0.00       0.00          0          7        104          85
Fetch    50358    279.03    1059.39     179103   30146895          0      755329
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    50370    279.88    1061.54     179103   30146908        104      755414

Misses in library cache during parse: 3

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                   50363        0.00          0.00
  SQL*Net message from client                 50362      157.17        227.70
  row cache lock                                141        0.03          0.67
  library cache lock                             77        0.01          0.21
  library cache pin                              75        0.01          0.27
  Disk file operations I/O                      791        0.00          0.01
  gc current block 3-way                     835881        0.15        305.35
  gc current block 2-way                     471360        0.24        144.04
  KJC: Wait for msg sends to complete            40        0.00          0.00
  gc cr multi block request                       8        0.00          0.00
  gc current block congested                  10014        0.03          4.23
  gc cr block 3-way                           20215        0.06          4.69
  gc current grant busy                          20        0.00          0.00
  gc cr grant 2-way                          165010        0.07         25.13
  db file sequential read                    179103        0.05        196.31
  gc cr grant congested                         729        0.19          0.36
  gc current block busy                       71431        0.05        118.15
  gc cr block 2-way                            1800        0.01          0.31
  latch free                                      3        0.00          0.00
  gc cr block congested                         197        0.01          0.06
  latch: cache buffers chains                    45        0.00          0.00
  latch: gc element                              15        0.00          0.00
  gc cr block busy                               15        0.02          0.07
  latch: object queue header operation            1        0.00          0.00
  KSV master wait                                 2        0.00          0.00
  ASM file metadata operation                     1        0.00          0.00
  SQL*Net more data to client                     1        0.00          0.00
  gc current grant 2-way                          6        0.00          0.00

An important initial observation is that the query returned 750,000 rows in 50,000 fetches (all figures rounded for convenience) and that's consistent with the SQL*Plus default arraysize of 15. So there might be a little time saved by setting the arraysize to a larger value – but only a few tens of seconds, based on the 227 seconds total minus the 157 second maximum wait for "SQL*Net message from client" – and there may be some benefit from increasing the SQL*Net SDU_SIZE at the same time. Critically, though, we should ask "why do you want a query to return 750,000 rows?", and "how fast do you think is 'reasonable'?" You'll also note from the "gc" waits that the system is based on RAC with at least 3 nodes – and RAC is always a suspect when you see unexpected time spent in a query.
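If the data really does have to come back to a SQL*Plus-like client, the change is a one-liner – the 500 below is purely an illustrative value, not a tested recommendation, and a JDBC client would set its fetch size on the statement or connection instead:

set arraysize 500

With 750,000 rows that would cut the number of fetch calls (and the associated "SQL*Net message" round trips) from roughly 50,000 to about 1,500.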

So where, between the last hash join (line 62) and the hash unique (line 61) of the driving query block, does most of the time go? It's in the query block whose plan starts at line 28, where we see 163,000 physical blocks read (pr=) and 535 seconds of elapsed time (time=, reported in microseconds). Only 6,400 of those blocks come from the index range scan operation at line 32; most of the work comes from line 31, which fetches 1 million rows (by index rowid) from table po_line_locations_archive_all.

    672743     672743     672743  TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ARCHIVE_ALL (cr=5507736 pr=163043 pw=0 time=535914552 us cost=3 size=27 card=1)
    672743     672743     672743   INDEX UNIQUE SCAN PO_LINE_LOCATIONS_ARCHIVE_U1 (cr=4560824 pr=163043 pw=0 time=533161038 us cost=2 size=0 card=1)(object id 42811947)
    755121     755121     755121    SORT AGGREGATE (cr=3540960 pr=163043 pw=0 time=530079821 us)
   1040963    1040963    1040963     TABLE ACCESS BY INDEX ROWID PO_LINE_LOCATIONS_ARCHIVE_ALL (cr=3540960 pr=163043 pw=0 time=534243973 us cost=5 size=15 card=1)
   1776649    1776649    1776649      INDEX RANGE SCAN PO_LINE_LOCATIONS_ARCHIVE_U1 (cr=1123074 pr=6392 pw=0 time=37128373 us cost=3 size=0 card=2)(object id 42811947)

This part of the workload comes from 672,743 executions of the subquery starting at line 36 of the original query text:

           (SELECT pla.promised_date
              FROM apps.po_line_locations_archive_all pla
             WHERE     pla.po_header_id = pha.po_header_id
                   AND pla.po_line_id = pll.po_line_id
                   AND pla.line_location_id = ploc.line_location_id
                   AND pla.revision_num =
                       (SELECT MIN (revision_num)
                          FROM apps.po_line_locations_archive_all plla2
                         WHERE     plla2.promised_date IS NOT NULL
                               AND plla2.line_location_id =
                                   ploc.line_location_id))

If we want to improve the performance of this query with a minimum of re-engineering, recoding and risk then a good point to start would be to examine this query block in isolation and see if there is a simple, low-cost way of improving its efficiency. (Note: this may not be a route to optimising the whole query “properly”, but it may give a quick win that is “good enough”.)
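As a purely illustrative sketch of the kind of rewrite that such an examination might suggest (untested, and assuming the po_header_id and po_line_id predicates don't change which archive rows qualify for a given line_location_id): the correlated MIN() subquery forces Oracle to visit po_line_locations_archive_all twice per execution – once to find the lowest qualifying revision_num and once to fetch its promised_date – and an aggregate using KEEP (DENSE_RANK FIRST ...) could collapse that into a single visit:

       NVL (
           (SELECT MIN (pla.promised_date)
                     KEEP (DENSE_RANK FIRST ORDER BY pla.revision_num)
              FROM apps.po_line_locations_archive_all pla
             WHERE     pla.po_header_id = pha.po_header_id
                   AND pla.po_line_id = pll.po_line_id
                   AND pla.line_location_id = ploc.line_location_id
                   AND pla.promised_date IS NOT NULL),
           ploc.promised_date)
           "Original_Promise_Date",

Whether this (or simply a change to the supporting index) is the right fix would, of course, have to be tested against the real data.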

We could go a little further down this route of optimising the scalar subqueries by looking at the time spent in each of them in turn. Taking out the top line of each of the separate sections of the plan and extracting just the pr, pw and time values (which I’ll scale back from microseconds to seconds) we get the following

pr=      0      pw=0    time=   0
pr=     28      pw=0    time=  23
pr=      0      pw=0    time=   2
pr=      0      pw=0    time=   2
pr= 163043      pw=0    time= 536
pr=   3564      pw=0    time=  64
pr=      1      pw=0    time=   0
pr=   5535      pw=0    time= 182
pr=      6      pw=0    time=   9
pr=      0      pw=0    time=   1

The 8th scalar subquery (line 42 in the plan, line 75 in the query) gives us an opportunity to reduce the run time by 182 seconds, so might be worth a little investment in programmer time.

The 6th subquery (line 34 in the plan, line 49 in the query) adds only 64 seconds to the run time, so we might be less inclined to do anything about it.

You might note that the 2nd, 3rd and 4th subqueries are against the same table with the same predicates to get three different columns – this group is the "obvious" choice for recoding as a single join rather than three separate subqueries. But if you look at the total times of the three subqueries, the "extra" two executions add only two seconds each to the total time – so although this scalar subquery coding pattern is undesirable, it's not necessarily going to be worth expending the effort to rewrite it in this case.
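For what it's worth, here is a cut-down illustration (not the full rewrite, and keeping the old-style join syntax of the original) of how the three subqueries could become one outer join. The unique scans on MTL_SYSTEM_ITEMS_B_U1 in the plan suggest that (inventory_item_id, organization_id) identifies at most one row, so the join should not introduce duplicates, and the outer join preserves the NULLs you would get when a scalar subquery finds no row:

SELECT pll.po_line_id,
       ploc.line_location_id,
       items.preprocessing_lead_time   "PreProcessing_LT",
       items.full_lead_time            "Processing_LT",
       items.postprocessing_lead_time  "PostProcessing_LT"
  FROM apps.po_lines_all           pll,
       apps.po_line_locations_all  ploc,
       apps.mtl_system_items_b     items
 WHERE ploc.po_line_id             = pll.po_line_id
   AND items.inventory_item_id(+)  = pll.item_id
   AND items.organization_id(+)    = ploc.ship_to_organization_id

The saving would be small in this case, but the same pattern applied to a more expensive subquery (or one executed far more often) can be well worth the effort.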

If you’re wondering, by the way, why different subqueries are reporting different numbers of rows returned (and each one should return at most one row on each execution), there are two reasons for any subquery to be reporting fewer than the 768,890 rows reported by the basic driving hash join:

  • first – an execution may simply return no rows,
  • secondly – there may be some benefits from scalar subquery caching.

One of the nice details about newer versions of Oracle is that the “starts” statistic is also reported in the trace/tkprof output so you would be able to see how much your query had benefited from scalar subquery caching.
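If re-running the statement with extended tracing isn't convenient, a similar check can be made from the cursor cache. Assuming the statement was executed with statistics_level set to all (or with the gather_plan_statistics hint), something like the following – run from the same session, or with an explicit sql_id and child number supplied – reports the Starts column alongside the actual row counts:

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));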

If we add together the time reported by each of the scalar subquery sections of the plan the total time reported is approximately 819 seconds. Cross-checking with the difference in the times reported for operations 61 and 62 (hash unique and hash join) we see: 1,059 seconds – 245 seconds = 814 seconds. This is a good match (allowing for the accumulation of a large number of small errors) for the 819 seconds reported in the subqueries – so virtually all of the time apparently charged to the hash unique really belongs to the scalar subqueries, and the hash unique itself isn't a significant part of the workload even though it has to process the full data set (while having virtually no effect on its volume – 768,890 rows in, 755,164 rows out). You'll note that it didn't spill to disc (pw = 0) but completed in memory.

Summary

I’ve written a quick note on this query because the coding style was undesirable and the execution plan quite lengthy. I’ve reviewed how the style of the SQL is echoed in the shape of the plan. I’ve then pursued the idea of optimising the code piece-wise to see if there were any opportunities for improving the performance “enough” without going through the effort of a complete redesign of the query.

Given the information in the rowsource execution statistics from the tkprof output it proved easy to identify where the largest amounts of time appeared, and which of them might be reduced by very localised optimisation.

In passing I pointed out the option for reducing the time spent on network traffic by increasing the array fetch size, and increasing the SDU_SIZE for the SQL*Net messages to client.

Ubuntu raspberry pi: upgrade to 22.04...

Dietrich Schroff - Wed, 2022-09-28 14:13

Ubuntu released version 22.04, so I decided to upgrade from

Ubuntu 21.04 (GNU/Linux 5.11.0-1027-raspi aarch64)

to

Ubuntu 22.04.1 LTS (GNU/Linux 5.15.0-1015-raspi aarch64)

But this was not as easy as I thought. The following services were running on my Raspberry Pi:

  • influxdb
  • collectd
  • telegraf
  • mosquitto
  • zigbee2mqtt
  • grafana

Only grafana was updated without any problem.

Collectd failed with this message:

Package 'collectd' has no installation candidate

no chance to fix that :(

Because of that I had to disable the collectd section in influxdb – with that section still enabled it failed with

influxd-systemd-start.sh[2293]: run: open server: open service: Stat(): stat /usr/share/collectd/types.db: no such file or directory

Then zigbee2mqtt was not able to write to mosquitto. This is due to a change in the default settings of mosquitto: allow_anonymous false is now the default, so I had to add

allow_anonymous true

After that zigbee2mqtt was able to write data to mqtt again.

Last thing: a restart of telegraf, because it did not start properly after the first reboot following the upgrade.

Not really a good update – my other Raspberry Pi will stay on 21.04 for some more months...

How to Sense-Check Your Data Science Findings

Rittman Mead Consulting - Wed, 2022-09-28 04:30

Introduction

One of the most common pitfalls in data science is investing too much time investigating a dead-end. It happens to the best of us; you're convinced that you've found something amazing and profound that will revolutionise the client's business intelligence...And then, in an awkward presentation to a senior stakeholder, somebody points out that your figures are impossible and somewhere along the way you've accidentally taken a sum instead of an average. Here are a few tricks you can and should use to make this kind of embarrassment less likely.

Visualise Early and Often

The sooner you plot your data, the quicker you will notice anomalies. Plot as many variables as you have time for, just to familiarise yourself with the data and check that nothing is seriously wrong. Some useful questions to ask yourself are:

• Is anything about these data literally impossible? E.g., is there an 'age' column where someone is over 200 years old?
• Are there any outliers? Sometimes data that has not been adequately quality-checked will have outliers due to things like decimal-point errors.
• Are there duplicates? Sometimes this is to be expected, but you should keep it in mind in case you end up double-counting duplicate entries.
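If the data happens to live in a database, the first and third of these checks are cheap to script before you even start plotting. The table and column names below are purely hypothetical, just to show the shape of the queries:

-- literally impossible values
select count(*) from people where age < 0 or age > 200;

-- crude outlier check: eyeball the extremes
select * from measurements order by reading_value desc fetch first 20 rows only;

-- duplicates that could be double-counted
select person_id, count(*)
from   people
group  by person_id
having count(*) > 1;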

If It's Interesting, It's Suspicious

I once worked on a project that found that social deprivation scores negatively correlated with mental health outcomes, i.e., more deprived groups had better mental health scores. This was exactly the kind of surprising result that would have made for a great story, with the unfortunate caveat that it wasn't at all true.
It turned out that I had mis-coded a variable; all my zeroes were ones, and all my ones were zeroes. Fortunately I spotted this before the time came to present my findings, but this story illustrates an important lesson:
Findings that are interesting and findings that are erroneous share a common property: both are surprising.

Talk to Subject-Matter Experts

'Ah, this is impossible- this number can never go above ten' is one of the most heart-breaking sentences I've ever heard in my career.

It's often the case that someone more familiar with the data will be able to spot an error that an analyst recently brought onto a project will miss. It is essential to consult subject-matter experts often to get their eyes on your findings.

Check for Missing Data

There is a legendary story in data collection about Survivorship Bias. In WWII, a group called the Statistical Research Group was looking to minimise bombers lost to enemy gunfire. A statistician named Abraham Wald made the shrewd observation that reinforcements should be added to the sections of planes that returned from missions without bullet holes in them, rather than – as intuition might suggest – the parts riddled with bullets. This was because those planes were the ones that returned safely, and so the parts that were hit were not essential to keeping the pilot alive.

https://en.wikipedia.org/wiki/Survivorship_bias

Missing data can poison a project's findings. There are a lot of reasons data could be missing, some more pernicious than others. If data is missing not-at-random (MNAR), it may be a sign of a systematic issue with data collection. It's very important to keep this in mind with something like survey data, where you should expect people who abandon the survey to be qualitatively different to people who complete it.

Understand Where the Data Came From and Why

Did you assemble this dataset yourself? If so, how recently did you assemble it, and do you still remember all of the filters and transformations you applied to get it into its current form? If someone else assembled it, did they document what steps they took, and are they still around to ask? Sometimes a dataset will be constructed with a use case in mind that is different from yours, and without knowing the logic that went into making it you run the risk of building atop a foundation of errors.

Beware the Perils of Time Travel

This one isn't a risk for all kinds of data, but it's a big risk for data with a date component – especially time series data. In brief, there is a phenomenon called Data Leakage wherein a model is built in such a way that it unintentionally cheats by using future data to predict the past. This trick only works when looking retrospectively because in the real world we cannot see the future. Data leakage is a big enough topic to deserve its own article, but just be aware that you should look it up before building any machine learning models.

Conclusion

It is impossible to come up with a fully-general guard against basic errors, and kidding yourself into thinking you have one will only leave you more vulnerable to them. Even if you do everything right, some things are going to slip you by. I encourage you to think of any additions you might have to this list.

Categories: BI & Warehousing

Separation of Duties (Poll Results Discussed)

Tim Hall - Wed, 2022-09-28 03:12

On the back of the recent patching polls I asked a couple of questions about separation of duties. As always, the sample size is small and my followers have an Oracle bias, so you can decide how representative you think these numbers are… Separation of Duties Here was the first question. This is exactly what … Continue reading "Separation of Duties (Poll Results Discussed)"


Joel Kallman Day 2022 : Announcement

Tim Hall - Tue, 2022-09-27 02:03

Since 2016 we’ve had an Oracle community day where we push out content on the same day to try and get a bit of a community buzz. The name has changed over the years, but in 2021 it was renamed to the “Joel Kallman Day”. Joel was big on community, and it seems like a … Continue reading "Joel Kallman Day 2022 : Announcement"


HTMX: Events and Triggers - Django CRUD, part 5

Andrejus Baranovski - Mon, 2022-09-26 08:16
Refreshing dependent fragments with data is a very common use case in enterprise applications. In this example, data is changed in an editable form. If the data is saved to the DB successfully through the Django backend, we raise an HTMX event and, with an HTMX trigger, refresh the dependent read-only table to display the latest data.

 

Oracle Database Patching (Poll Results Discussed)

Tim Hall - Mon, 2022-09-26 02:05

Having recently put out a post about database patching, I was interested to know what people out in the world were doing, so I went to Twitter to ask. As always, the sample size is small and my followers have an Oracle bias, so you can decide how representative you think these numbers are… Patching … Continue reading "Oracle Database Patching (Poll Results Discussed)"


Running x86_64 Docker Images on Mac M1 Max (Oracle Database 19c)

DBASolved - Thu, 2022-09-22 14:42

  A few months ago, I wrote a post about me switching back to Windows (here).  I can say I […]

The post Running x86_64 Docker Images on Mac M1 Max (Oracle Database 19c) appeared first on DBASolved.

Categories: DBA Blogs

A Few Words About OAC Embedding

Rittman Mead Consulting - Thu, 2022-09-22 12:10
TL;DR To exit VIM you press Esc, then type :q! to just exit or :wq to save changes and exit and then press Enter.

Some time ago – and by that I mean almost exactly approximately about two years ago – Mike Durran (https://insight2action.medium.com) wrote a few blogs describing how to embed Oracle Analytics Cloud (OAC) content into public third-party sites.

Oracle Analytics Cloud (OAC) Embedding — Public User Access — Part 1
Oracle Analytics Cloud (OAC) Embedding — Public User Access — Part 2

For anyone who needs to embed OAC reports into their sites, these blogs are a must-read and a great source of valuable information. Just like his other blogs and the official documentation, of course.

Visualizing Data and Building Reports in Oracle Analytics Cloud
The topics in this section explain how to use the JavaScript embedding framework to embed Oracle Analytics content into applications and web pages.

If you have ever tried it, you most likely noticed that the embedding process is not exactly easy or intuitive. Roughly it consists of the following steps:

  1. Create content for embedding.
  2. Setup infrastructure for authentication:
    2.1. Create an Oracle Identity Cloud Service (IDCS) application.
    2.2. Create an Oracle Functions function.
    2.3. Set up Oracle API Gateway.
  3. Embed JavaScript code to the third-party site.

Failing to implement any of the above leads to a fully non-functional thing.

And here is the problem: Mike knows this well. Too well. Some things that are entirely obvious to him aren't obvious to anyone trying to implement it for the first time. When you know something at a high level, you tend to skip bits and bobs here and there and various tasks look easier than they are.

A small story: When I was studying at university, our teacher told us a story. Her husband was writing a math book for students and wrote the infamous phrase all students love: "... it is easy to prove that ...". She said to him that, if it was easy to prove, he should do it.

He spent a week proving it.

That is why I think that I can write something useful on this topic. I'm not going to repeat everything Mike wrote, and I'm not going to re-write his blog. I hope that I can fill in a few gaps and show how to do some of those "it is easy to do" things.

Also, this blog is not intended to be a complete step-by-step guide. Or, at least, I have no intention of writing such a thing. Although it frequently happens that I start writing a simple one-paragraph hint and a few hours later I'm still proofreading something with three levels of headers and animated screen captures.

Disclaimer. This blog is not a critique of Mike's blog. What he did is hard to overestimate and my intention is just to fill some gaps.

Not that I needed to make the previous paragraph a disclaimer, but all my blogs have at least one disclaimer and once you get locked into a serious disclaimers collection, the tendency is to push it as far as you can.

Testing out Token Generation

My main problem with this section is the following. Or, more precisely, not a problem but a place that might require more clarification in my opinion.

You’ll see that the token expires in 100 seconds and I describe how to increase that in a separate blog. For now, you can test this token will authenticate your OAC embedded content by copying the actual token into the following example HTML and deploying on your test web server or localhost (don’t forget to add suitable entries into the OAC safe domains page in the OAC console)

I mean, why exactly is 100 seconds a bad value? What problem does increasing this value solve? Or, from the practical point of view, how do we understand that our problem is the token lifespan?

It is easy and confusing at the same time. The easy part is that after the token has expired, no interaction with the OAC content is possible. It is not a problem if you embed non-interactive content. If the users can only watch but not touch, the default value is fine. However, if the users can set filters or interact with reports in any way, tokens must live longer than the expected interaction time.

Here is what it looks like when the token is OK:

And the same page a few minutes later:

Assuming that we don't know the right answer and need to find it, how do we do it? The browser developer console is your friend! The worst thing you can do to solve problems is to randomly change parameters and press buttons hoping that it will help (well, sometimes it does help, but don't quote me on that). To actually fix it we need to understand what is going on.

To be fair, at first sight, the most obvious and visible message is totally misleading. Normally, we go to the Console tab (Ctrl+Shift+J/Command+Option+J) and read what is written there. But if the token is expired, we get this:

The console shows multiple CORS errors:

Access to XMLHttpRequest at 'https://OAC-INSTANCE-NAME.analytics.ocp.oraclecloud.com/ui/dv/ui/api/v1/sessioninfo/ext' from origin 'https://THIRD-PARTY-SITE' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

CORS stands for Cross-Origin Resource Sharing. In short, CORS is a security mechanism implemented in all modern browsers which lets a server specify whether its content may be used by pages from another origin.

So, looking at this, we might assume that the solution would be either to specify Safe Domains in OAC or to set a CORS policy for our web server, or both. In reality, this message is misleading. The real error can be found on the Network tab.

Let's take a look at the first failed request.

Simply click it once and check the Headers tab. Here we can clearly see that the problem is caused by the token, not by CORS. The token is expired.

The same approach helps when there is something else wrong with the token. For example, I once selected the wrong OAC instance for the Secure app. Everything was there. All options were set. All privileges were obtained. The token generation was perfect. Except it didn't work. The console was telling me that the problem was CORS. But here I got the real answer.

Oracle Functions Service

I feel like this is the part which can use more love. There are a few easy-to-miss things here.

And the most important thing: why do we need Oracle Functions at all? Can't we achieve our goal without Functions? And the answer is yes, we can. Both Oracle Functions and API Gateway are optional components.

In theory, we can use the Secure application directly. For example, we can set up a cron job that gets the token from the Secure application and then embeds it directly into static HTML pages using sed or Python or whatever we like. It (theoretically) will work. Note that I didn't say it was a better idea. Or even a good one. What I'm trying to say is that Functions is not an essential part of this process. We use Oracle Functions to make the process more manageable, but it is only one of the possible good solutions, not the only one.

So what happens at this step is that we are creating a small self-contained environment with a Node.js application running in it. It is all based on Docker and the Fn Project, but that is not important to us.

The function we are creating is just one building block whose purpose is to keep the end result simple and manageable.

High-level steps are:

  1. Create an application.
  2. Open the application and either use Cloud Shell (the easy option) or set up a development machine.
  3. Init a boilerplate code for the function.
  4. Edit the boilerplate code and write your own function.
  5. Deploy the code.
  6. Run the deployed code.

Creating a new function is easy.  Go to Developer Services -> Applications

Create a new function and set networking for it. The main thing to keep in mind here is that the network should have access to Oracle Cloud Infrastructure Registry. If it doesn't have access, you'll get the error "Fn: Error invoking function. status: 502 message: Failed to pull function image" when trying to run the function (see Issues invoking functions).

The first steps with Oracle Functions are simple and hard at the same time. Simple, because when you go to Functions you see the commands which should be executed to get it up and running. Hard, because it is not obvious what is happening and why. And, also, the diagnostics could've been better if you ask me.

After you create an application, open it, go to Getting started, press Launch Cloud Shell and do what all programmers do: copy and paste commands, trying to look smart and busy in the process. Literally. There are commands you can copy and paste to get a fully working Hello World example written in Java. Just one command has a placeholder to be changed.

Hint: to make your life easier first do step #5 (Generate an Auth Token) and then come back to the steps 1-4 and 6-11.

If everything is fine, you will see the "Hello, world!" message. I wonder, does it make me a Java developer? At least a junior? I heard that is how this works.

OK, after the Java hello-world example works, we can add Node.js to the list of our skills. Leave the Java hello-world example and initialize a new node function:

cd
fn init --runtime node embed_func

This creates a new Node.js boilerplate function located in the embed_func directory (the actual name is not important, you can choose whatever you like). Now go to this directory, edit the func.js file and put Mike's code there.

cd embed_func
vim func.js

- do some vim magic
- try to exit vim

I don't feel brave enough to give directions on using vim. If you don't know how to use vim but value your life or your reason, find someone who knows it.

But because I know that many wouldn't trust me anyway, I can say that to start editing the text you press i on the keyboard (note -- INSERT -- at the bottom of the screen); then, to save your changes and exit, press Esc (-- INSERT -- disappears), type :wq and press Enter. To exit without saving type :q!, and to save without exiting, :w. Read more about it here: 10 Ways to Exit Vim Editor

Image source: https://www.linuxfordevices.com/tutorials/linux/exit-vim-editor

Most likely, after you have created a new node function, pasted Mike's code and deployed it, it won't work and you'll get this message:

Error invoking function. status: 504 message: Container failed to initialize, please ensure you are using the latest fdk and check the logs

I'm not a Node.js pro, but I found that installing NOT the latest version of the node-fetch package helps.

cd embed_func
npm install node-fetch@2.6.7

At the moment of writing this, the latest stable version of this package is 3.2.10: https://www.npmjs.com/package/node-fetch. I didn't test absolutely all versions, but the latest 2.x version seems to be fine and the latest 3.x version doesn't work (most likely because 3.x is ESM-only and can't be loaded with a CommonJS require() call).

If everything was done correctly and you managed to exit vim, you can run the function and get the token.

fn invoke <YOUR APP NAME> <YOUR FUNCTION NAME>

This should give you a token every time you run this. If it doesn't, fix the problem first before moving on.

Oracle API Gateway

API Gateway allows for easier and safer use of the token.

Just like Functions, API Gateway is not an essential part. I mean, once (if) we have decided to use Oracle Functions, it makes sense to also use a Gateway. Setting up a gateway to call a function only takes a few minutes, no coding is required, and things like CORS or HTTPS are handled automatically. With this said, using API Gateway is a no-brainer.

In a nutshell, we create a URL and every time we call that URL we get a token. It is somewhat similar to where we started. If you remember, the first step was "creating" a URL that we could call to get a token. The main and significant difference is that now all the details like login and password are safely hidden behind the API Gateway and Oracle Functions.

Before Functions and Gateway it was:

curl --request POST \
 --url https://<IDCS-domain>.identity.oraclecloud.com/oauth2/v1/token \
 --header 'authorization: Basic <base64 encoded clientID:ClientSecret>' \
 --header 'content-type: application/x-www-form-urlencoded;charset=UTF-8' \
 -d 'grant_type=password&username=<username>&password=<password>&scope=\
 <scope copied from resource section in IDCS confidential application>'

With API Gateways the same result can be achieved by:

curl --request GET --url https://<gateway>.oci.customer-oci.com/<prefix>/<path>

Note that there are no longer details like login and password, clientID and ClientSecret for the Secure application, or internal IDs. Everything is safely hidden behind closed doors.

API Gateways can be accessed via the Developer Services -> [API Management] Gateways menu.

We click Create Gateway and fill in some very self-explanatory properties like name or network. Note that this URL will be called from the Internet (assuming that you are doing this to embed OAC content into a public site), so you must select the network accordingly.

After a gateway is created, go to Deployments and create one or more, well, deployments. In our case a deployment is a call to our previously created function.

There are a few things to mention here.

Name is simply a marker for you so you can distinguish one deployment from another. It can be virtually anything you like.

Path prefix is the part that actually appears in the URL. It has to follow rather strict URL rules.

The other very important thing is CORS. At the beginning of this blog I already mentioned CORS but that time it was a fake CORS message. This time CORS is actually important.

If we are embedding OAC content into the site called https://thirdparty.com, we must add a CORS policy allowing us to do so.

If we don't do it, we will get an actual true authentic CORS error (the Network tab of the browser console):

The other very likely problem is this: after you have created a working function, exited vim, created a gateway and defined a deployment, you try to test it and get the error message {"code":500,"message":"Internal Server Error"}:

If you are getting this error, it is possible that the problem is caused by a missing policy:

Go to the Policies page in the OCI console and create a policy like this:

ALLOW any-user to use functions-family in compartment <INSERT YOUR COMPARTMENT HERE> where ALL { request.principal.type= 'ApiGateway'}

A few minor things

It is rather easy to copy pieces of embedding code from the Developer menu. However, by default this menu option is disabled.

It can be enabled in the profile. Click your profile icon, open Profile then Advanced and Enable Developer Options. It is mentioned in the documentation but let's be real: nobody reads it.

If you simply take the embedding script, it won't work.

This code lacks two important modules: jquery and obitech-application/application. If either of them is missing you will get this error: Uncaught TypeError: application.setSecurityConfig is not a function. And by the way, the order of these modules is not exactly random. If you put them in an incorrect order, you will likely get the same error.

As a conclusion

After walking this path with a million ways to die, we get this beautiful-looking page: Niðurstaða stafrænna húsnæðisáætlana 2022

https://hms.is/husnaedi/husn%C3%A6%C3%B0isa%C3%A6tlanir/m%C3%A6labor%C3%B0-husn%C3%A6%C3%B0isa%C3%A6tlana/ni%C3%B0ursta%C3%B0a-stafr%C3%A6nna-husn%C3%A6%C3%B0isa%C3%A6tlana-2022
Categories: BI & Warehousing

OAC Semantic Modeler and Version Control with Git

Rittman Mead Consulting - Wed, 2022-09-21 12:35

This is my third blog post in the series about OAC's Semantic Modeler. The first one was an overview of the new Semantic Modeler tool, the second was about the new SMML language that defines Semantic Modeler's objects. This post is about something that OBIEE developer teams have been waiting for years – version control. It looks like the wait is over – Semantic Modeler comes with native Git support.

When you open Semantic Modeler from OAC, you will see two toggle buttons in the bottom right corner:

The right toggle is for Git Panel, where version control magic takes place.

Enabling Git for a Semantic Model

Version control with Git can be enabled for a particular Semantic Model, not the whole Modeler repository. When you first open the Git Panel, it will inform you that it requires configuration.

Click Start and you will be asked for a Git repository URL and the name of the main branch. I created my test repository on GitHub, but you may have your own company-internal Git server. The easiest way to establish version control for a Model is to create an empty Git repository beforehand - that is what I did. In the "Initialize Git" prompt, I copied the full https URL of my newly created, empty (no README.md in it!) GitHub repository and clicked "Continue".

If the repository URL is recognised as valid, you will get the next prompt to choose a Git profile, which holds your Git login credentials. To create a new profile, enter your Git user name and password (or Personal Access Token if you are using GitHub) and give the profile a name.

Click "Initialize Git". After a short while, a small declaration of success should pop up...

... and the Git Panel will now have a typical set of Git controls and content.

Next, let us see it in action.

Git and Semantic Modeler - the Basics

The basics of Semantic Modeler's version control are quite intuitive and user friendly. Let us start by implementing a simple change to our Customers dimension: renaming a column.

We type in the new column name and press Enter. We see that the Unstaged Changes list in the Git Panel is still empty. We press Ctrl+S to save our changes and the Unstaged Changes list gets updated straight away.

We click on "Stage All". At the bottom of the Git panel, "Commit description" input and "Commit" button appear.

We enter a description, click "Commit" and get a message:

However, the changes have not yet been pushed to the Git server - we need to push them by clicking the "Push" button.

Now let us check the repository content in the Git server.

We can see the "Dim - Customers.json" SMML file has just been updated.

Git and Semantic Modeler - Working with Branches

At Rittman Mead we are evangelists of Gitflow - it works well with multiple developers working independently on their own features and allows us to be selective about which features go into the next release. The version control approach we have developed for OBIEE RPD versioning as part of our BI Developer Toolkit relies on Gitflow. However, full Gitflow is not available to us here. No worries though - where there is a will, there is a way. Each of our devs can still have their own branch.

Before we start playing with branches, let us make sure our main branch is saved, checked in and pushed. To create a new branch, we click on the "Create Local Branch" button.

We base it on our current "main" branch. We name it "dev-janis" and click "Create".

If successful, the Current branch caption will change from "main" to "dev-janis". (An important part of version control discipline is to make sure we are working with the correct branch.)

In our dev branch, let us rename the "LAST_NAME" column to "Last Name".

Save.

Stage. Commit. Push.

Once pushed, we can check on the Git server whether the new branch has been created and explore its content.

We can also switch back to the "main" branch to check that the "LAST_NAME" column name remains unchanged there.

Git and Semantic Modeler - Merge Changes

The point of having multiple dev branches is to merge them at some point. How easy is that?

Let us start with changes that should merge without requiring conflict resolution.

In the previous chapter we already implemented changes in the "dev-janis" branch. We could merge it into the "main" branch now, but Git would recognise this as a trivial fast-forward merge because the "main" branch has seen no changes since we created the "dev-janis" branch. In other words, Git does not need to look at the repository content or perform a 3-way merge - all it needs to do is repoint the "main" branch to the "dev-janis" branch. That is too easy.

Before merging, we will implement a change in the "main" branch.

We switch to the "main" branch.

We rename the "INITIALS" column to "Initials".

Save. Stage. Check in. Push.

Let us remind ourselves that in the "dev-janis" branch, the INITIALS column is not renamed and the LAST_NAME column is - there should be no merge conflicts.

To merge the "dev-janis" branch into the "main" branch, we switch to the "main" branch. We click the "Merge" button.

We select the Merge branch to be "dev-janis" and click "Merge".

The Merge Strategy value applies to conflict resolution. In our simple case, there will be no merge conflicts so we leave the Strategy as Default.

After the merge, I could see both the "INITIALS" and the "LAST_NAME" columns renamed - the merge worked perfectly!

Save. Stage. Check In. Push.

Well, how easy was that!? Anybody who has managed an OBIEE RPD multidev environment will appreciate the new Semantic Modeler.

Git and Semantic Modeler - Merge Conflict Resolution

At last we have come to merges that require conflict resolution - the worst nightmare of OBIEE RPD multidev projects. How does it work with OAC's Semantic Modeler?

Let us create a trivial change that will require conflict resolution - we will rename the same column differently in two branches and then will merge them. The default Git merge algorithm will not be able to perform an automatic 3-way merge.

We use our trusted Customers dimension: we select the "main" branch and rename the "DOB" column to "Date of Birth".

"main" branch:

Save. Stage. Check in. Push.

In the "dev-janis" branch, we rename the same "DOB" column to "DoB".

"dev-janis" branch:

To merge, we switch back to the "main" branch. (Pull if necessary.) As the branch to be merged with, we choose "dev-janis".

As for Merge Strategy, we have 3 options here: Default, Ours and Theirs. (In our example, Ours would be the "main" branch whereas Theirs would be "dev-janis".) We can use the Ours and Theirs strategies if we are 100% certain that, in case of a conflict, we should always prefer one branch or the other. In most cases, however, we want to see what the conflicts are before deciding on a resolution, so I expect the Default strategy will almost always be used. You can read more about merge strategies in OAC's documentation.

We call the Merge command.

As expected, we get merge errors - two of them. (The reason there are two is that Business Layer column names are automatically propagated to the Presentation Layer, hence one error from the Business Layer and another from the Presentation Layer.)

We click on the first conflict. Lo and behold - we get a proper A/B conflict merge window that does a decent job of showing us the merge conflict. However, I did find it a bit buggy - the "Resolve Item" and "Resolve All" buttons only work when you click on their edges, and the A and B version toggle buttons did not work at all for me.

However, I got it working by using the Take All buttons, which was fine since there was only a single conflict to resolve. I wanted the B version, so I clicked the Take All button in the B frame, then the Resolve Item button, and then pressed Ctrl+S. That did the trick and the error disappeared from the Merge Conflicts list. I did the same with the Presentation Layer conflict. After that, there were no more merge conflicts in the Merge frame and it showed the message "Merge successful".

And successful it was. This is the end result - the Customers dimension in the "main" branch after a conflict-resolved merge.

Version control merge conflict resolution can be challenging. I recommend you read the Understand and Resolve Merge Conflicts chapter from the OAC documentation.

Conclusions

As of the time of writing, version control features like change staging, checking in, pushing and pulling, and switching between branches appear to be rock-solid. When it comes to merging, and in particular merge conflict resolution, the functionality is a bit more capricious, but it still worked for me.

Overall, Git support in Semantic Modeler looks well designed and will be a huge improvement over OBIEE RPD versioning.

If you want to run advanced queries against your repository content from Python, or to auto-generate Semantic Model content with scripts, version control with Git will enable that - and it will be easier than with the OBIEE RPD.

Categories: BI & Warehousing

Autonomous Database Strategic Customer Program

Tom Kyte - Wed, 2022-09-21 05:26
Is there a link to information about the ADB-S Strategic Customer Program?
Categories: DBA Blogs

HTMX: Saving Form Changes - Django CRUD, part 4

Andrejus Baranovski - Wed, 2022-09-21 03:29
I explain how to run a POST request through an HTMX attribute on an HTML form tag to save form changes to the backend. You will see how to report validation errors through the HTMX response.

 

Database Patching : It’s a difficult subject

Tim Hall - Wed, 2022-09-21 02:42

If you came here hoping I was going to say there are valid reasons not to patch, you are out of luck. There is never a valid reason not to patch… Instead this post is more about the general approach to patching. I’ve spent 22+ years writing about Oracle, including how to install it, but … Continue reading "Database Patching : It’s a difficult subject"

The post Database Patching : It’s a difficult subject first appeared on The ORACLE-BASE Blog.

Fluid Fundamentals 90-minute Webcast

Jim Marion - Tue, 2022-09-20 10:04

I am constantly amazed at the flexibility of PeopleSoft's Fluid UX. Through AddJavaScript and AddStylesheet, we can make the user interface behave or appear however we desire. But in all our cleverness, we can't neglect the fundamentals such as:

  • What are the differences between Classic and Fluid?
  • What are the purposes of the various Fluid layouts?
  • How do I align columns and fields in Fluid?
  • What do I do with related display fields? Why won't they appear beside their control fields?
  • How do you render grids on mobile?
  • Is there a place for the scroll area in Fluid?
We regularly teach Fluid development principles. As developers become familiar with Fluid's CSS capabilities, they begin writing their own CSS. Is that OK? Should we write our own CSS? Absolutely! A better question is, "when?" When should we write our own CSS? When should we inject our own JavaScript? These are the questions we answer through Fluid fundamentals. Whether you have years of experience with Fluid or you are just starting your Fluid journey, we all learn by getting back to the basics: The Fundamentals. So join us on September 29th from 1:00 to 2:30 PM Central for 90 minutes of Fluid Fundamentals. Register now!

Saving the World from Fat-finger moments – with regexp_like

The Anti-Kyte - Tue, 2022-09-20 01:30

It’s not uncommon for a database application to have its behaviour defined, to an extent at least, by records in reference data tables.
By its nature, this data is static and the tables in which it resides tend to contain comparatively few rows. However, such an approach can be susceptible to erroneous data entry, especially where key values are concerned.
Having spent many an “entertaining” afternoon/evening/dead-of-night trying to hunt down some mystery bug, only to find that one of these values includes an extraneous space or invisible control character, I’ve come to appreciate the ability of regexp_like to point these things out.
The code examples that follow should be based on some sensible data set, probably from the HR demo schema. However, Dr Fatfinger does sound rather like a Bond villain…

Over the years, MI6 has built up a list of individuals whose activities bear close scrutiny.
They want to know if any of these people start bulk buying piranha food, or looking at hollowed out volcanoes on Zoopla :

create table megalomaniacs (
    evil_genius varchar2(100),
    start_date date,
    constraint megalomaniacs_pk primary key (evil_genius))
/

insert into megalomaniacs( evil_genius, start_date)
values('DOCTOR NO', to_date('01-OCT-1962', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('ERNST STAVRO'||chr(256)||' BLOFELD', to_date('10-OCT-1963', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values(' AURIC GOLDFINGER', to_date('17-SEP-1964', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('EMILIO LARGO', to_date('09-DEC-1965', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('DOCTOR KANANGA', to_date('27-JUN-1973', 'DD-MON-YYYY'));
	
insert into megalomaniacs( evil_genius, start_date)
values('FRANCISCO SCARAMANGA', to_date('19-DEC-1974', 'DD-MON-YYYY'));
	
insert into megalomaniacs( evil_genius, start_date)
values('KARL STROMBERG', to_date('7-JUL-1977', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('HUGO DRAX ', to_date('26-JUN-1979', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('ARISTOTLE KRISTATOS', to_date('24-JUN-1981', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('KAMAL KHAN', to_date('06-JUN-1983', 'DD-MON-YYYY'));
	
insert into megalomaniacs( evil_genius, start_date)
values('MAX ZORIN', to_date('22-MAY-1985', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('GENERAL KOSKOV', to_date('29-JUN-1987', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('FRANZ SANCHEZ', to_date('13-JUN-1989', 'DD-MON-YYYY'));
	
insert into megalomaniacs( evil_genius, start_date)
values('ALEC TREVELYAN', to_date('13-NOV-1995', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('ELLIOT CARVER', to_date('09-DEC-1997', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('ELEKTRA KING', to_date('08-NOV-1999', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('COLONEL TAN-SUN MOON', to_date('20-NOV-2002', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('MR WHITE', to_date('14-NOV-2006', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('DOMINIC GREEN', to_date('20-OCT-2008', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('RAOUL SILVA', to_date('23-OCT-2012', 'DD-MON-YYYY'));

insert into megalomaniacs( evil_genius, start_date)
values('LYUTSIFER SAFIN	', to_date('28-SEP-2021', 'DD-MON-YYYY'));

commit;

However, some of these people are slipping through the net…


select evil_genius, to_char(start_date, 'DD-MON-YYYY') as start_date
from megalomaniacs 
where evil_genius in ( 'MR WHITE', 'LYUTSIFER SAFIN', 'AURIC GOLDFINGER', 'ERNST STAVRO BLOFELD');

EVIL_GENIUS                    START_DATE          
------------------------------ --------------------
MR WHITE                       14-NOV-2006         

1 row selected. 

We suspect the handiwork of Dr Fatfinger, possibly through the activities of those notorious henchpeople, Copy and Paste.

Fortunately, we can use a regexp to identify any records that contain:

  • a leading or trailing non-printing character
  • a control character
select evil_genius, length( evil_genius),
    to_char(start_date, 'DD-MON-YYYY') as start_date
from megalomaniacs
where regexp_like(evil_genius, '^[[:space:]]|[[:cntrl:]]|[[:space:]]$', 'i');

Ah, Dr Fatfinger. We’ve been expecting you !
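
As an aside, the same kind of check is handy outside the database too, for example when eyeballing a data file before it is loaded. Here is a rough Python equivalent, for illustration only - Python's re module does not support POSIX classes like [[:cntrl:]], so \s and an explicit control-character range stand in for them (the sample values are taken from the inserts above):

import re

# leading whitespace, any ASCII control character, or trailing whitespace
fatfinger = re.compile(r'^\s|[\x00-\x1f\x7f]|\s$')

suspects = [
    'MR WHITE',
    ' AURIC GOLDFINGER',      # leading space
    'HUGO DRAX ',             # trailing space
    'LYUTSIFER SAFIN\t',      # trailing tab
]

for name in suspects:
    if fatfinger.search(name):
        print(f'Dodgy value: {name!r}')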

Mock Service URLs for PeopleSoft Testing

Jim Marion - Mon, 2022-09-19 14:21

 As you create, learn, and discover Integration Broker's capabilities, it is helpful to have mock APIs for testing purposes. Here is a list of my favorites.

Many of these services have secure and non-secure alternatives. The value of insecure testing is that you don't have to import certificates. On the other hand, the secure versions help you prove your PeopleSoft certificate import skills.

Do you have some sample APIs you use? If so, please share them in the comments!

At JSMpros, we teach PeopleSoft REST and SOAP integrations regularly. Want to learn more? Check out our Integration Tools Update course to learn best practices and strategies for incorporating REST and JSON into your development process!

Vagrant : “SSH auth method: private key” – Timed out…

Tim Hall - Sun, 2022-09-18 04:25

Out of nowhere I recently started to get problems with Vagrant running on a Windows 11 host. The “vagrant up” command would always hang at the “SSH auth method: private key” stage. You can see an example of the output here.

default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key

… Continue reading "Vagrant : “SSH auth method: private key” – Timed out…"

The post Vagrant : “SSH auth method: private key” – Timed out… first appeared on The ORACLE-BASE Blog.

Simple Code-only REST Request

Jim Marion - Wed, 2022-09-14 16:13

PeopleSoft is a metadata-driven application. It stands to reason, therefore, that the solutions we create require metadata. Sometimes metadata, which is supposed to be configurable, forces us down an immutable (non-configurable) path. Integration Broker is a fantastic example of semi-mutable, bordering on immutable metadata. As we prototype, we often make changes. But the testing associated with prototyping locks various metadata pieces. Sometimes we just want to test without metadata and backfill after we choose the proper path.

Here is a very simple metadata-free REST request example I run from a local App Engine program for prototyping and testing purposes.

Now, an important question:

Once you have a working "simple" solution, do you backfill and update with proper Integration Broker metadata?

Let us know your thoughts in the comments below!

At JSMpros, we teach Integration Tools concepts regularly. Check out our website to see what we are offering next!

OAC Semantic Modeler and the SMML Language

Rittman Mead Consulting - Wed, 2022-09-14 05:28

There are a few reasons to like the Oracle Analytics Cloud (OAC) paired with the new Semantic Modeler:

  • It works in the cloud. I mean, it actually works. (Check my blog post for an overview of Semantic Modeler.)
  • It has native, built-in Git integration. (A blog post about that is coming shortly.)
  • The SMML language.

The SMML, which is short for Semantic Modeler Markup Language, is my personal favourite feature coming with the new Semantic Modeler.

The old UDML and the new SMML

In the old days the only way to manipulate OBIEE RPD content automatically was to export the RPD in UDML format, tweak it and then import it back. UDML was proprietary, undocumented and an utter nightmare to parse - content analysis was usually done with sophisticated regexes and it did not work very well. The use of UDML was mostly "local": you would copy an RPD object into Notepad, tweak it and then copy it back into the RPD. That all changed with the XML export feature - parsing of the whole repository became feasible, almost easy. At Rittman Mead we have parsed the entire RPD content and inserted it into a relational database to make it available for further analysis, such as lineage or a data dictionary. The XML export capability was a game changer. However, for individual RPD object manipulation we were still stuck with UDML.

In OAC and with the new Semantic Modeler, UDML is replaced with SMML. It is documented. It is JSON-based, which I usually prefer to XML. Instead of modifying UDML at our own risk, Semantic Modeler objects are available in SMML format from the Modeler itself - the Modeler features an SMML editor.

Editing SMML in Semantic Modeler

The easiest way to access Semantic Modeler objects in SMML format is from the Modeler itself. You right-click a Modeler object, choose Open in SMML Editor from the pop-up menu, and a new tab opens with the script.

However, this way you are accessing only one object at a time. (You cannot multi-select objects to have them in a single SMML editor view.) Moreover, when opening a high-level object like a Business Model, you will not get full SMMLs of its Logical Tables - you will have to open the SMML Editor for each Logical Table separately. Also, some low-level objects like Columns do not have their own SMML - when opening SMML Editor for a particular Column, you will get the SMML code for the whole Table.

Let us give it a try. We go to the Logical Layer, double-click the "Dim - Customers" Logical Table to open it in designer.

Now, right-click the same "Dim - Customers" table and open the SMML Editor from the menu. The two tabs will sit next to each other.

Let us change a dataType value from NUMERIC to VARCHAR:

To save the changes, we either press Ctrl+S or click the save icon in the top right corner. Now let us go back to the designer tab. We see that the data type in the designer has changed from '99' to 'ab' - without refreshing the designer we can see the change there. (Note that the same change cannot be done from the designer - data types are normally derived from the Physical sources, instead of specified explicitly.)

Can repository content be broken in SMML Editor? Let us try that by renaming a JSON key:

When trying to save it, I get this error:

No explanation, no line number. In this case, the save did not take place. I can either revert the change in the SMML Editor itself or I can close the Editor tab and choose to discard changes.

However, the way errors come up is inconsistent. For some errors, I would get this message:

When I choose to proceed (common sense says No but I'm doing this for you!), the Dim - Customers dimension disappears from the Business Model...

But it is not all lost. It is now found in the Invalid Files section.

Here I can open it and this time the row with the error is actually highlighted. I remove the unneeded commas that make the JSON invalid and save it.

However, the file is still in the Invalid Files section and still missing from the Business Model - I don't think that is how it is supposed to work. Perhaps I am missing a simple trick here but I could not find an easy way of moving the Customer dimension back to the Business Model. The best solution for me was to use the git reset command.

I will describe git integration in my next blog post.

SMML editing works well, but when you get it wrong, the editor does not do a good job of telling you what and where the issue is. However, it does tell you there is an issue, and when that happens, do not save!

We have seen that SMML editing works well for individual objects. But how about a whole repository export and import?

Whole Repository SMML Export and Import

The documentation does not go into much detail on whole repository exports and imports. I found two ways of exporting the whole repository in SMML format and one way of importing it.

The easiest way to export the whole Semantic Model is to open it in Modeler and then open the triple-dot menu from the top right corner.

There you can choose to Export and then specify the name of the Zip file.

Upon pressing Export, the zip file gets generated and your browser will download it. When you unpack the archive, this is what you see - (at least) three folders...

...with lots of JSON files in them - those are SMML scripts.

An alternative way of exporting the whole repository in SMML format is to push it to a Git repository. In Git, the content is stored in exactly the same format as in the export zip file.

When the repository is in Git, you can clone it locally and edit it with a text editor - the content is the same as the zip export. When done, you commit and push your changes back to the Git repository and then pull them from Semantic Modeler - this is the only way I found to import the whole repository into Semantic Modeler.
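
For illustration, the clone-edit-push half of that round trip can be scripted. This is a minimal sketch in Python, assuming a hypothetical repository URL and using the logical/HelloBusinessModel/Dim - Customers.json file seen in the export above; the final step - pulling the changes - still happens in Semantic Modeler's Git Panel:

import json
import subprocess
from pathlib import Path

# Hypothetical repository URL and local working folder
repo_url = "https://github.com/example/oac-semantic-model.git"
local_dir = Path("oac-semantic-model")

# Clone the repository that Semantic Modeler pushes to
subprocess.run(["git", "clone", repo_url, str(local_dir)], check=True)

# Edit one SMML file - with a text editor, or programmatically as here
smml_file = local_dir / "logical" / "HelloBusinessModel" / "Dim - Customers.json"
smml = json.loads(smml_file.read_text(encoding="utf-8"))
smml["logicalTable"]["logicalColumns"][0]["name"] = "Customer Key"   # example edit only
smml_file.write_text(json.dumps(smml, indent=2), encoding="utf-8")

# Commit and push; afterwards, pull the changes from the Git Panel in Semantic Modeler
subprocess.run(["git", "-C", str(local_dir), "add", "-A"], check=True)
subprocess.run(["git", "-C", str(local_dir), "commit", "-m", "Edit Dim - Customers SMML"], check=True)
subprocess.run(["git", "-C", str(local_dir), "push"], check=True)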

Querying OAC Repository SMML Content with Python

Why do I love the SMML so much? Because it presents a great opportunity for repository content analysis and development acceleration.

Now that we have the entire OAC repository exported in JSON format, we can use Python to query it or even modify its content. Python is very good at handling data in JSON format - it is much easier than with XML.

Let us start with something simple - in Python we open the Customers dimension JSON to count columns in it:

import json

with open("D:\OAC-Semantic-Modeller\Repo\logical\HelloBusinessModel\Dim - Customers.json") as f:
    customerJson = json.load(f)

    print(f"Logical Table Name is {customerJson['logicalTable']['name']} and its type is {customerJson['logicalTable']['type']}.")

    logicalColumnNames = [lc['name'] for lc in customerJson['logicalTable']['logicalColumns']]
    print(f"The table has {len(logicalColumnNames)} columns:\n\t{', '.join(logicalColumnNames)}")

These are just a few lines of code and the output looks like this:

Note above that in the SMML JSON, the name attribute is a single value whereas logicalColumns is a list. In Python they become a plain value and a list respectively.

It would be easy to modify the above script to do the same for more than one table, or to run it regularly to keep track of table column changes...
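
For example, a small variation on the script could snapshot the column names of every Logical Table in the export and report anything that changed since the last run - a sketch, assuming the same unzipped export layout as above:

import json
import os

smmlFolder = r"D:\OAC-Semantic-Modeller\Repo\logical\HelloBusinessModel"
snapshotFile = "column_snapshot.json"

# Collect the current column names for every Logical Table JSON in the folder
currentColumns = {}
for fileName in os.listdir(smmlFolder):
    if not fileName.endswith(".json"):
        continue
    with open(os.path.join(smmlFolder, fileName)) as f:
        logicalTable = json.load(f)["logicalTable"]
    currentColumns[logicalTable["name"]] = sorted(lc["name"] for lc in logicalTable["logicalColumns"])

# Compare with the previous snapshot, if we have one
if os.path.exists(snapshotFile):
    with open(snapshotFile) as f:
        previousColumns = json.load(f)
    for tableName, columns in currentColumns.items():
        if columns != previousColumns.get(tableName):
            print(f"Columns changed in {tableName}")

# Save the new snapshot for the next run
with open(snapshotFile, "w") as f:
    json.dump(currentColumns, f, indent=2)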

However, it would be more useful to do queries across multiple repository objects. With a bit more effort, we can do that.

Let us set ourselves a task: get a list of all Logical Table Columns in the format <Logical Table name>.<Logical Column name> (<Logical Column data type>), but without opening each JSON individually as we did in the first script. Instead we want to load the whole repository into memory and then run our queries.

import json
import os
from functools import reduce

def getFileContent(filePath):
    with open(filePath) as f:
        return json.load(f)


def getFolderContent(folderPath):
    folderPaths = [{'name': f, 'path': os.path.join(folderPath, f)} for f in os.listdir(folderPath) if not os.path.isfile(os.path.join(folderPath, f))]
    filePaths   = [{'name': f, 'path': os.path.join(folderPath, f)} for f in os.listdir(folderPath) if os.path.isfile(os.path.join(folderPath, f))]

    folderContent = [{f['name']: getFolderContent(f['path'])} for f in folderPaths]
    fileContent = [{f['name']: getFileContent(f['path'])} for f in filePaths]

    return reduce(lambda a, b: {**a, **b}, folderContent + fileContent)


smmlRootPath = "D:\OAC-Semantic-Modeller\Repo"  # this is the root path to where I unzipped the SMML export
allRepositoryJson = getFolderContent(smmlRootPath)

# get all Logical Tables in the Repository
logicalTableNames = [lt['logicalTable']['name'] for lt in allRepositoryJson['logical']['HelloBusinessModel'].values()]
print(f"There are {len(logicalTableNames)} Logical Tables in the repository: {', '.join(logicalTableNames)}\n")

# get all Logical Table Columns
ltColumns = []
for ltName in logicalTableNames:
    ltColumns.extend([
        f"{ltName}.{lc['name']} ({lc['dataType']})"
         for lc in allRepositoryJson['logical']['HelloBusinessModel'][ltName + '.json']['logicalTable']['logicalColumns']
    ])

div = "\n\t * "
print(f"There are {len(ltColumns)} Columns found across {len(logicalTableNames)} Logical Tables:{div}{div.join(ltColumns)}")

The script is still quite simple and now allows us to run pretty much any query we can think of against the repository. The (top of the) output looks like this:

The new SMML language allows for much easier repository content analysis and manipulation with Python - I expect that no big OAC project in the future will go without a Python developer. Development standards checks, change monitoring, development speed measurement, lineage tracing, repetitive repository content generation - these are just a few scripting opportunities that come to mind.
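
As a taste of the standards-check idea, and reusing the allRepositoryJson structure already loaded by the script above, a few extra lines could flag Logical Columns whose names still look like physical column names (the naming rule here is only an example):

import re

# Flag logical columns that still look like physical names, e.g. LAST_NAME instead of Last Name
physicalStylePattern = re.compile(r"^[A-Z0-9_]+$")

for fileName, fileJson in allRepositoryJson['logical']['HelloBusinessModel'].items():
    logicalTable = fileJson['logicalTable']
    for lc in logicalTable['logicalColumns']:
        if physicalStylePattern.match(lc['name']):
            print(f"Naming check: {logicalTable['name']}.{lc['name']} looks like a physical column name")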

Conclusions
  • The SMML language is a major improvement over OBIEE's UDML language;
  • It is a good way of quickly editing a repository object in a text editor;
  • Editing SMML is more dangerous than using the standard object designer. The SMML editor will give you warnings if you try to save something dodgy but you can break things;
  • Use Python to run repository-wide queries and updates. It is easier than it was with OBIEE's XML extracts;
  • If planning to use SMML editing extensively, enable git and follow best version control practices - if you do break something, make sure it is your own dev branch.
Categories: BI & Warehousing

Recordings Available!

Jim Marion - Mon, 2022-09-12 13:29

Did you know we record all of our events? Replays are available through our online learning platform. Be sure to check out the webinar collection. Each replay includes sample code downloads, handouts, a Q&A roster, and links to additional resources. Topics include:

  • Integration (REST, SOAP, Cloud, Application Services Framework)
  • Fluid (Fundamentals, Drop Zones, Grids, Navigation, Complex CSS)
  • PeopleCode Application Classes
  • Isolated Customizations and Configuration Alternatives (Drop Zones, Event Mapping, Related Content, Related Actions, Activity Guides, Page and Field Configurator)

Do you prefer interactive training with plenty of hands-on activities? Check out our schedule to see what we are offering next!
