Merge tag 'for-linus-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "This is a big pull request.

  Of note is that I'm sending you the new ioctl API for the rdma
  subsystem. We put it up on linux-api@, but didn't get much response.
  The API is complex, but it solves two different problems in one go:

   1) The bi-directional nature of the RDMA file write calls, which
      created the security hole we had to handle (and for which the fix
      is now causing problems for systems in production; we were a bit
      overzealous in the fix, and the ability to open a device, then
      fork, then create new queue pairs on the device and use them is
      broken).

   2) The bloat caused by different vendors implementing extensions to
      the base verbs API. Each vendor's hardware is slightly different,
      and the hardware might be suitable for one extension but not
      another.

      By the time we add generic extensions for all the different ways
      that the different hardware can offload things, the API becomes
      bloated. Things like our completion structs have started to exceed
      a cache line in size because of all the elements needed to support
      this. That in turn shows up heavily in the performance graphs with
      a noticeable drop in performance on 100 Gigabit links as our
      completion structs go from occupying one cache line to more than
      one.

      This API makes things like the completion structs modular in a
      very similar way to netlink so that your structs include only
      the items needed for the offloads/features you are actually using
      on a given queue pair. In that way we support everything, but only
      use what we need, and our structs stay smaller.

  The ioctl API is better explained by the posting on linux-api@ than I
  can explain it here, so I'll just leave it at that.

  The rest of the pull request is typical stuff.

  Updates for 4.14 kernel merge window

   - Lots of hfi1 driver updates (mixed with a few qib and core updates
     as well)

   - rxe updates

   - various mlx updates

   - Set default roce type to RoCEv2

   - Several larger fixes for bnxt_re that were too big for -rc

   - Several larger fixes for qedr that, likewise, were too big for -rc

   - Misc core changes

   - Make the hns_roce driver compilable on arches other than aarch64 so
     we can more easily debug build issues related to it

   - Add rdma-netlink infrastructure updates

   - Add automatic IRQ affinity infrastructure

   - Add 32bit lid support

   - Lots of misc fixes across the subsystem from random people

   - Autoloading of RDMA netlink modules

   - PCI pool cleanups from Romain Perier

   - mlx5 driver feature additions and fixes

   - Hardware tag matching feature

   - Fix sleeping in atomic when resolving roce ah

   - Add experimental ioctl interface as posted to linux-api@"
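
As a rough illustration of the netlink-like modularity described in the
message above, the sketch below shows how per-feature data can travel as
optional typed attributes instead of as fields in one ever-growing
completion struct. This is a minimal sketch of the encoding idea only;
every name in it is hypothetical and it is not the actual uverbs ioctl
ABI added by this merge.

/* Hypothetical TLV-style attribute walk: a consumer parses only the
 * attributes it asked for, so the common completion layout can stay
 * within a single cache line. */
#include <stdint.h>
#include <stddef.h>

struct attr_hdr {
	uint16_t type;	/* e.g. a made-up HYPOTHETICAL_ATTR_TIMESTAMP */
	uint16_t len;	/* payload length in bytes, payload follows */
};

static const void *find_attr(const void *buf, size_t buf_len, uint16_t type)
{
	size_t off = 0;

	while (off + sizeof(struct attr_hdr) <= buf_len) {
		const struct attr_hdr *hdr =
			(const struct attr_hdr *)((const char *)buf + off);
		/* records are padded to 8 bytes, netlink style */
		size_t rec = (sizeof(*hdr) + hdr->len + 7) & ~(size_t)7;

		if (off + sizeof(*hdr) + hdr->len > buf_len)
			break;			/* malformed or truncated buffer */
		if (hdr->type == type)
			return hdr + 1;		/* payload of the wanted attribute */
		off += rec;
	}
	return NULL;				/* attribute not present */
}

Only the attributes actually requested for a given queue pair need to be
encoded, which is the property the message above credits with keeping the
completion structures small.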

* tag 'for-linus-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (328 commits)
  IB/core: Expose ioctl interface through experimental Kconfig
  IB/core: Assign root to all drivers
  IB/core: Add completion queue (cq) object actions
  IB/core: Add legacy driver's user-data
  IB/core: Export ioctl enum types to user-space
  IB/core: Explicitly destroy an object while keeping uobject
  IB/core: Add macros for declaring methods and attributes
  IB/core: Add uverbs merge trees functionality
  IB/core: Add DEVICE object and root tree structure
  IB/core: Declare an object instead of declaring only type attributes
  IB/core: Add new ioctl interface
  RDMA/vmw_pvrdma: Fix a signedness
  RDMA/vmw_pvrdma: Report network header type in WC
  IB/core: Add might_sleep() annotation to ib_init_ah_from_wc()
  IB/cm: Fix sleeping in atomic when RoCE is used
  IB/core: Add support to finalize objects in one transaction
  IB/core: Add a generic way to execute an operation on a uobject
  Documentation: Hardware tag matching
  IB/mlx5: Support IB_SRQT_TM
  net/mlx5: Add XRQ support
  ...
Linus Torvalds 2017-09-03 17:49:17 -07:00
commit aa9d4648c2
275 changed files with 14913 additions and 5268 deletions

@@ -290,6 +290,7 @@ enum mlx5_event {
MLX5_EVENT_TYPE_GPIO_EVENT = 0x15,
MLX5_EVENT_TYPE_PORT_MODULE_EVENT = 0x16,
MLX5_EVENT_TYPE_REMOTE_CONFIG = 0x19,
MLX5_EVENT_TYPE_GENERAL_EVENT = 0x22,
MLX5_EVENT_TYPE_PPS_EVENT = 0x25,
MLX5_EVENT_TYPE_DB_BF_CONGESTION = 0x1a,
@@ -304,6 +305,10 @@ enum mlx5_event {
MLX5_EVENT_TYPE_FPGA_ERROR = 0x20,
};
enum {
MLX5_GENERAL_SUBTYPE_DELAY_DROP_TIMEOUT = 0x1,
};
enum {
MLX5_PORT_CHANGE_SUBTYPE_DOWN = 1,
MLX5_PORT_CHANGE_SUBTYPE_ACTIVE = 4,
@@ -968,7 +973,7 @@ enum mlx5_cap_type {
MLX5_CAP_ATOMIC,
MLX5_CAP_ROCE,
MLX5_CAP_IPOIB_OFFLOADS,
MLX5_CAP_EOIB_OFFLOADS,
MLX5_CAP_IPOIB_ENHANCED_OFFLOADS,
MLX5_CAP_FLOW_TABLE,
MLX5_CAP_ESWITCH_FLOW_TABLE,
MLX5_CAP_ESWITCH,
@@ -1011,6 +1016,10 @@ enum mlx5_mcam_feature_groups {
MLX5_GET(per_protocol_networking_offload_caps,\
mdev->caps.hca_max[MLX5_CAP_ETHERNET_OFFLOADS], cap)
#define MLX5_CAP_IPOIB_ENHANCED(mdev, cap) \
MLX5_GET(per_protocol_networking_offload_caps,\
mdev->caps.hca_cur[MLX5_CAP_IPOIB_ENHANCED_OFFLOADS], cap)
#define MLX5_CAP_ROCE(mdev, cap) \
MLX5_GET(roce_cap, mdev->caps.hca_cur[MLX5_CAP_ROCE], cap)

@@ -162,6 +162,13 @@ enum dbg_rsc_type {
MLX5_DBG_RSC_CQ,
};
enum port_state_policy {
MLX5_POLICY_DOWN = 0,
MLX5_POLICY_UP = 1,
MLX5_POLICY_FOLLOW = 2,
MLX5_POLICY_INVALID = 0xffffffff
};
struct mlx5_field_desc {
struct dentry *dent;
int i;
@@ -185,6 +192,7 @@ enum mlx5_dev_event {
MLX5_DEV_EVENT_GUID_CHANGE,
MLX5_DEV_EVENT_CLIENT_REREG,
MLX5_DEV_EVENT_PPS,
MLX5_DEV_EVENT_DELAY_DROP_TIMEOUT,
};
enum mlx5_port_status {
@@ -291,7 +299,7 @@ struct mlx5_cmd {
struct semaphore pages_sem;
int mode;
struct mlx5_cmd_work_ent *ent_arr[MLX5_MAX_COMMANDS];
struct pci_pool *pool;
struct dma_pool *pool;
struct mlx5_cmd_debug dbg;
struct cmd_msg_cache cache[MLX5_NUM_COMMAND_CACHES];
int checksum_disabled;
@@ -410,6 +418,7 @@ enum mlx5_res_type {
MLX5_RES_SQ = MLX5_EVENT_QUEUE_TYPE_SQ,
MLX5_RES_SRQ = 3,
MLX5_RES_XSRQ = 4,
MLX5_RES_XRQ = 5,
};
struct mlx5_core_rsc_common {
@@ -525,6 +534,9 @@ struct mlx5_mkey_table {
struct mlx5_vf_context {
int enabled;
u64 port_guid;
u64 node_guid;
enum port_state_policy policy;
};
struct mlx5_core_sriov {
@@ -534,7 +546,6 @@ struct mlx5_core_sriov {
};
struct mlx5_irq_info {
cpumask_var_t mask;
char name[MLX5_MAX_IRQ_NAME];
};
@@ -597,7 +608,6 @@ struct mlx5_port_module_event_stats {
struct mlx5_priv {
char name[MLX5_MAX_NAME_LEN];
struct mlx5_eq_table eq_table;
struct msix_entry *msix_arr;
struct mlx5_irq_info *irq_info;
/* pages stuff */
@@ -840,13 +850,6 @@ struct mlx5_pas {
u8 log_sz;
};
enum port_state_policy {
MLX5_POLICY_DOWN = 0,
MLX5_POLICY_UP = 1,
MLX5_POLICY_FOLLOW = 2,
MLX5_POLICY_INVALID = 0xffffffff
};
enum phy_port_state {
MLX5_AAA_111
};
@@ -1089,7 +1092,7 @@ enum {
};
enum {
MAX_UMR_CACHE_ENTRY = 20,
MR_CACHE_LAST_STD_ENTRY = 20,
MLX5_IMR_MTT_CACHE_ENTRY,
MLX5_IMR_KSM_CACHE_ENTRY,
MAX_MR_CACHE_ENTRIES
@@ -1183,4 +1186,10 @@ enum {
MLX5_TRIGGERED_CMD_COMP = (u64)1 << 32,
};
static inline const struct cpumask *
mlx5_get_vector_affinity(struct mlx5_core_dev *dev, int vector)
{
return pci_irq_get_affinity(dev->pdev, MLX5_EQ_VEC_COMP_BASE + vector);
}
#endif /* MLX5_DRIVER_H */
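
The mlx5_get_vector_affinity() helper added in the hunk above is part of
the automatic IRQ affinity work mentioned in the pull message: it maps a
completion vector back to the affinity mask of its interrupt. A minimal
sketch of a hypothetical caller (not code from this merge) that prefers a
vector local to the current CPU might look like this:

/* Sketch only: pick the completion vector whose IRQ affinity mask covers
 * the CPU we are running on; relies on <linux/smp.h> and <linux/cpumask.h>. */
static int pick_local_comp_vector(struct mlx5_core_dev *dev, int nvec)
{
	int cpu = raw_smp_processor_id();
	int vec;

	for (vec = 0; vec < nvec; vec++) {
		const struct cpumask *mask = mlx5_get_vector_affinity(dev, vec);

		if (mask && cpumask_test_cpu(cpu, mask))
			return vec;
	}
	return 0;	/* no local vector found, fall back to vector 0 */
}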

@@ -200,6 +200,7 @@ enum {
MLX5_CMD_OP_QUERY_SQ = 0x907,
MLX5_CMD_OP_CREATE_RQ = 0x908,
MLX5_CMD_OP_MODIFY_RQ = 0x909,
MLX5_CMD_OP_SET_DELAY_DROP_PARAMS = 0x910,
MLX5_CMD_OP_DESTROY_RQ = 0x90a,
MLX5_CMD_OP_QUERY_RQ = 0x90b,
MLX5_CMD_OP_CREATE_RMP = 0x90c,
@@ -294,8 +295,10 @@ struct mlx5_ifc_flow_table_fields_supported_bits {
u8 inner_tcp_dport[0x1];
u8 inner_tcp_flags[0x1];
u8 reserved_at_37[0x9];
u8 reserved_at_40[0x1a];
u8 bth_dst_qp[0x1];
u8 reserved_at_40[0x40];
u8 reserved_at_5b[0x25];
};
struct mlx5_ifc_flow_table_prop_layout_bits {
@@ -431,7 +434,9 @@ struct mlx5_ifc_fte_match_set_misc_bits {
u8 reserved_at_100[0xc];
u8 inner_ipv6_flow_label[0x14];
u8 reserved_at_120[0xe0];
u8 reserved_at_120[0x28];
u8 bth_dst_qp[0x18];
u8 reserved_at_160[0xa0];
};
struct mlx5_ifc_cmd_pas_bits {
@@ -599,7 +604,7 @@ struct mlx5_ifc_per_protocol_networking_offload_caps_bits {
u8 rss_ind_tbl_cap[0x4];
u8 reg_umr_sq[0x1];
u8 scatter_fcs[0x1];
u8 reserved_at_1a[0x1];
u8 enhanced_multi_pkt_send_wqe[0x1];
u8 tunnel_lso_const_out_ip_id[0x1];
u8 reserved_at_1c[0x2];
u8 tunnel_statless_gre[0x1];
@@ -840,7 +845,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 retransmission_q_counters[0x1];
u8 reserved_at_183[0x1];
u8 modify_rq_counter_set_id[0x1];
u8 reserved_at_185[0x1];
u8 rq_delay_drop[0x1];
u8 max_qp_cnt[0xa];
u8 pkey_table_size[0x10];
@@ -857,7 +862,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 pcam_reg[0x1];
u8 local_ca_ack_delay[0x5];
u8 port_module_event[0x1];
u8 reserved_at_1b1[0x1];
u8 enhanced_error_q_counters[0x1];
u8 ports_check[0x1];
u8 reserved_at_1b3[0x1];
u8 disable_link_up[0x1];
@@ -873,7 +878,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 max_tc[0x4];
u8 reserved_at_1d0[0x1];
u8 dcbx[0x1];
u8 reserved_at_1d2[0x3];
u8 general_notification_event[0x1];
u8 reserved_at_1d3[0x2];
u8 fpga[0x1];
u8 rol_s[0x1];
u8 rol_g[0x1];
@@ -1016,7 +1022,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 log_max_wq_sz[0x5];
u8 nic_vport_change_event[0x1];
u8 reserved_at_3e1[0xa];
u8 disable_local_lb[0x1];
u8 reserved_at_3e2[0x9];
u8 log_max_vlan_list[0x5];
u8 reserved_at_3f0[0x3];
u8 log_max_current_mc_list[0x5];
@@ -1187,7 +1194,8 @@ struct mlx5_ifc_cong_control_r_roce_ecn_np_bits {
u8 reserved_at_c0[0x12];
u8 cnp_dscp[0x6];
u8 reserved_at_d8[0x5];
u8 reserved_at_d8[0x4];
u8 cnp_prio_mode[0x1];
u8 cnp_802p_prio[0x3];
u8 reserved_at_e0[0x720];
@@ -2014,6 +2022,10 @@ enum {
MLX5_QPC_PM_STATE_MIGRATED = 0x3,
};
enum {
MLX5_QPC_OFFLOAD_TYPE_RNDV = 0x1,
};
enum {
MLX5_QPC_END_PADDING_MODE_SCATTER_AS_IS = 0x0,
MLX5_QPC_END_PADDING_MODE_PAD_TO_CACHE_LINE_ALIGNMENT = 0x1,
@@ -2057,7 +2069,8 @@ struct mlx5_ifc_qpc_bits {
u8 st[0x8];
u8 reserved_at_10[0x3];
u8 pm_state[0x2];
u8 reserved_at_15[0x7];
u8 reserved_at_15[0x3];
u8 offload_type[0x4];
u8 end_padding_mode[0x2];
u8 reserved_at_1e[0x2];
@@ -2437,7 +2450,7 @@ struct mlx5_ifc_sqc_bits {
u8 cd_master[0x1];
u8 fre[0x1];
u8 flush_in_error_en[0x1];
u8 reserved_at_4[0x1];
u8 allow_multi_pkt_send_wqe[0x1];
u8 min_wqe_inline_mode[0x3];
u8 state[0x4];
u8 reg_umr[0x1];
@@ -2515,7 +2528,7 @@ enum {
struct mlx5_ifc_rqc_bits {
u8 rlky[0x1];
u8 reserved_at_1[0x1];
u8 delay_drop_en[0x1];
u8 scatter_fcs[0x1];
u8 vsd[0x1];
u8 mem_rq_type[0x4];
@@ -2562,7 +2575,9 @@ struct mlx5_ifc_rmpc_bits {
struct mlx5_ifc_nic_vport_context_bits {
u8 reserved_at_0[0x5];
u8 min_wqe_inline_mode[0x3];
u8 reserved_at_8[0x17];
u8 reserved_at_8[0x15];
u8 disable_mc_local_lb[0x1];
u8 disable_uc_local_lb[0x1];
u8 roce_en[0x1];
u8 arm_change_event[0x1];
@@ -3000,7 +3015,7 @@ struct mlx5_ifc_xrqc_bits {
struct mlx5_ifc_tag_matching_topology_context_bits tag_matching_topology_context;
u8 reserved_at_180[0x880];
u8 reserved_at_180[0x280];
struct mlx5_ifc_wq_bits wq;
};
@@ -3947,7 +3962,47 @@ struct mlx5_ifc_query_q_counter_out_bits {
u8 local_ack_timeout_err[0x20];
u8 reserved_at_320[0x4e0];
u8 reserved_at_320[0xa0];
u8 resp_local_length_error[0x20];
u8 req_local_length_error[0x20];
u8 resp_local_qp_error[0x20];
u8 local_operation_error[0x20];
u8 resp_local_protection[0x20];
u8 req_local_protection[0x20];
u8 resp_cqe_error[0x20];
u8 req_cqe_error[0x20];
u8 req_mw_binding[0x20];
u8 req_bad_response[0x20];
u8 req_remote_invalid_request[0x20];
u8 resp_remote_invalid_request[0x20];
u8 req_remote_access_errors[0x20];
u8 resp_remote_access_errors[0x20];
u8 req_remote_operation_errors[0x20];
u8 req_transport_retries_exceeded[0x20];
u8 cq_overflow[0x20];
u8 resp_cqe_flush_error[0x20];
u8 req_cqe_flush_error[0x20];
u8 reserved_at_620[0x1e0];
};
struct mlx5_ifc_query_q_counter_in_bits {
@@ -5229,7 +5284,9 @@ struct mlx5_ifc_modify_nic_vport_context_out_bits {
};
struct mlx5_ifc_modify_nic_vport_field_select_bits {
u8 reserved_at_0[0x16];
u8 reserved_at_0[0x14];
u8 disable_uc_local_lb[0x1];
u8 disable_mc_local_lb[0x1];
u8 node_guid[0x1];
u8 port_guid[0x1];
u8 min_inline[0x1];
@@ -5847,6 +5904,28 @@ struct mlx5_ifc_destroy_rq_in_bits {
u8 reserved_at_60[0x20];
};
struct mlx5_ifc_set_delay_drop_params_in_bits {
u8 opcode[0x10];
u8 reserved_at_10[0x10];
u8 reserved_at_20[0x10];
u8 op_mod[0x10];
u8 reserved_at_40[0x20];
u8 reserved_at_60[0x10];
u8 delay_drop_timeout[0x10];
};
struct mlx5_ifc_set_delay_drop_params_out_bits {
u8 status[0x8];
u8 reserved_at_8[0x18];
u8 syndrome[0x20];
u8 reserved_at_40[0x40];
};
struct mlx5_ifc_destroy_rmp_out_bits {
u8 status[0x8];
u8 reserved_at_8[0x18];

@@ -560,6 +560,9 @@ int mlx5_core_destroy_qp(struct mlx5_core_dev *dev,
int mlx5_core_qp_query(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp,
u32 *out, int outlen);
int mlx5_core_set_delay_drop(struct mlx5_core_dev *dev,
u32 timeout_usec);
int mlx5_core_xrcd_alloc(struct mlx5_core_dev *dev, u32 *xrcdn);
int mlx5_core_xrcd_dealloc(struct mlx5_core_dev *dev, u32 xrcdn);
void mlx5_init_qp_table(struct mlx5_core_dev *dev);

@@ -38,6 +38,7 @@
enum {
MLX5_SRQ_FLAG_ERR = (1 << 0),
MLX5_SRQ_FLAG_WQ_SIG = (1 << 1),
MLX5_SRQ_FLAG_RNDV = (1 << 2),
};
struct mlx5_srq_attr {
@@ -56,6 +57,10 @@ struct mlx5_srq_attr {
u32 user_index;
u64 db_record;
__be64 *pas;
u32 tm_log_list_size;
u32 tm_next_tag;
u32 tm_hw_phase_cnt;
u32 tm_sw_phase_cnt;
};
struct mlx5_core_dev;

@@ -114,5 +114,6 @@ int mlx5_core_modify_hca_vport_context(struct mlx5_core_dev *dev,
u8 other_vport, u8 port_num,
int vf,
struct mlx5_hca_vport_context *req);
int mlx5_nic_vport_update_local_lb(struct mlx5_core_dev *mdev, bool enable);
int mlx5_nic_vport_query_local_lb(struct mlx5_core_dev *mdev, bool *status);
#endif /* __MLX5_VPORT_H__ */